Scrapebox: Google, Yahoo and Bing question

Discussion in 'Black Hat SEO' started by MrGr33n, Mar 2, 2015.

  1. MrGr33n

    MrGr33n Regular Member

    Joined:
    Oct 9, 2014
    Messages:
    225
    Likes Received:
    42
    When I harvest URLs, should I only be using Google, or Yahoo and Bing as well? I bought 10 semi-dedicated proxies, and after harvesting about 5,000 URLs I can't harvest any more with Google. I checked my proxies and they all work, so I am confused about what is going on. Out of curiosity I tried Yahoo and that didn't work either, but Bing works extremely well.

    My settings are 1 connection for Google and 3 each for Yahoo and Bing. The RND timeout is 5 seconds.

    Any advice would be much appreciated. Thanks.
     
    Last edited: Mar 2, 2015
  2. MrGr33n

    MrGr33n Regular Member

    So I unchecked "Use multi-threaded harvester" and now I'm getting error 302 (IP blocked). I only used my semi-dedicated proxies for a few minutes. Why is this happening?
     
  3. loopline

    loopline Jr. VIP

    Joined:
    Jan 25, 2009
    Messages:
    3,726
    Likes Received:
    1,993
    Gender:
    Male
    Your proxies are blocked. Firstly, you need to understand that 1 connection with 10 proxies on Google no longer works. It's more like 1 connection per 20+ proxies for basic keywords and 1 connection per 30-50+ proxies for advanced operators.

    Next up, Yahoo and Bing aren't that far behind. I believe Yahoo changed something recently that broke both the multi-threaded and single-threaded harvesters in 1.x. So you could use the custom harvester and update the engines file.

    Aside from that, the custom harvester gives you 20+ engines to harvest from.

    Furthermore, on that note, you should try Scrapebox 2.0; it's highly optimized for scraping. I've gotten over a million URLs per minute from Google (lots of proxies required). Anyway, there are loads of engines. Try DeeperWeb: engines like that are less ban-prone and pull their results from Google.

    http://www.scrapebox.com/v2-beta
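
    To put the proxy-to-connection rule of thumb above into numbers, here is a minimal sketch. It is not part of Scrapebox; the function name is made up for illustration, and it assumes the low end of the quoted ranges (20 proxies per connection for basic keywords, 30 for advanced operators).

```python
def max_google_connections(proxy_count: int, advanced_operators: bool = False) -> int:
    """Estimate how many Google connections a proxy pool can support.

    Assumption: 20 proxies per connection for basic keywords and 30 for
    advanced operators, i.e. the low end of the ranges quoted in this post.
    A result of 0 means the pool is too small for even one connection.
    """
    proxies_per_connection = 30 if advanced_operators else 20
    return proxy_count // proxies_per_connection


if __name__ == "__main__":
    for pool in (10, 40, 100):
        print(pool, max_google_connections(pool), max_google_connections(pool, True))
```

    By this estimate a pool of 10 proxies supports zero Google connections, which matches the blocks described in the opening post.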
     
  4. MrGr33n

    MrGr33n Regular Member

    Hi Loopline, thank you for the reply. I should mention that I joined here recently as a noob and got bitten by the SEO bug. I have been devouring as much information as I can, and I am a great fan of your videos. Thank you for them.

    I noticed your recent 'Can Scrapebox 2.0 harvest 1 million urls per minute from Google?' but didn't watch it, because I had scrolled all the way down to the bottom to start from the first one. I did however check what version 2.0 was and saw:

    The fact that you can use server proxies is amazing. However, as in my OP, I am unsure whether anything other than Google should be used for harvesting URLs, because in the videos and tutorials I have seen only Google is ticked, and GSA SER also uses only Google engines. I am using Scrapebox to scrape URLs to use in GSA, to try to rank a parasite in Google. Does it make a difference which search engine I scrape the URLs from in Scrapebox? If not, then isn't Scrapebox 2.0 the best thing ever, and why doesn't everyone jump on it?
     
    Last edited: Mar 3, 2015
  5. loopline

    loopline Jr. VIP

    Glad you liked the videos and found them helpful. Yes, server proxies are excellent.

    It's irrelevant which engine you scrape the URLs from. Google will find them; they are fundamentally good at that very thing, crawling and indexing the web, and I dare say no one is better. So scrape from any engine. By all means, the others are there to be used, but too many people get in a rut and think google, google, google, google is king, google is all there is, google drool.... google. lol Think outside the box.

    Scrapebox 2.0 is all that and the best thing, but people don't jump on it because they are too busy reading misinformation, worrying about it, and trying to find a way around it, but never testing to see if it's true. Then you have the ones seeking the money tree that doesn't exist.

    My advice is to jump in, get your hands and feet dirty, test everything, and learn all you can. There is probably 10 times as much bad info out there as there is good, so you have to decipher it, and sometimes the only way is to test it. Besides, that's how you figure out what works and what doesn't anyway.

    For example, it wasn't all that long ago that all the prophets of doom were saying that blog networks were dead, SEO was dead, Google was deindexing sites, and all this. At the same time I've got customers flipping out about only wanting ******** high-PR links and not wanting nofollow links, and all sorts of nonsense. Meanwhile Dori Friend did a test and ranked a brand new site in a top spot using only links on deindexed, penalized blog network domains.

    Basically, when Google says something, it's probably either half true or not true at all. When people run around screaming that the world is ending, it's the best time to jump in and figure out what really does work, because something works. There is a new loophole; find it and exploit it. :)
     
    Last edited: Mar 3, 2015
  6. cholzie

    cholzie Newbie

    Joined:
    Mar 8, 2015
    Messages:
    2
    Likes Received:
    1
    loopline, what a great explanation that was!
    I was struggling with Scrapebox (v1) a few days ago, and most of it was about the proxies. I tried both public and private proxies, but the results were always pathetic.
    Then I thought something must be wrong, so I searched for your video about Scrapebox v2 on YouTube, watched it, and downloaded v2 soon after.
    After trying it for a few minutes, I think I have struck gold!
    Scrapebox v2 is a big, big improvement. Everything about it is just great!

    MrGr33n,
    about the search engine you use for harvesting URLs: in my opinion it doesn't really matter at all.
    For every URL indexed by Google, I am pretty sure Bing and Yahoo have already indexed it too,
    so it all really depends on your keywords and footprints.
    Then, if you are using the list for GSA SER, just remember to set up good filtering and you should be fine.
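
    Since the same URL often comes back from more than one engine, one simple filtering step is to merge the lists and drop duplicates before feeding them to another tool. The sketch below is only an illustration and is not a Scrapebox or GSA SER feature; the function name and the rule for what counts as a duplicate (same host, path, and query, ignoring scheme, host case, and a trailing slash) are assumptions.

```python
from urllib.parse import urlsplit


def merge_harvests(*url_lists):
    """Merge URL lists from several engines, keeping the first copy of each URL.

    Assumption: two URLs are duplicates when host (case-insensitive), path
    (ignoring a trailing slash) and query string all match; the scheme is ignored.
    """
    seen = set()
    merged = []
    for urls in url_lists:
        for raw in urls:
            url = raw.strip()
            parts = urlsplit(url)
            if not parts.netloc:
                continue  # skip blank lines and anything that is not an absolute URL
            key = (parts.netloc.lower(), parts.path.rstrip("/"), parts.query)
            if key not in seen:
                seen.add(key)
                merged.append(url)
    return merged


if __name__ == "__main__":
    google = ["http://Example.com/page/", "http://a.org/x"]
    bing = ["https://example.com/page", "http://b.net/", ""]
    print(merge_harvests(google, bing))
```

    Order is preserved, so whichever engine's list is passed first wins when the same page shows up twice.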
     
  7. dpers

    dpers Newbie

    Joined:
    Dec 22, 2014
    Messages:
    3
    Likes Received:
    1
    Loopline,

    I'm having some issues when scraping Google with 2.x (other engines work perfectly). I tested with proxies first and got red, then ran the test without proxies and got red again. I followed exactly the same steps on 1.x and it works fine; everything comes out green.

    What am I missing here (random bug)?

    Also, the settings tab is missing some important options and addons that 1.x used to have. Can you please shed some light on that?

    Thank you.