Scrapebox Harvester Help

Discussion in 'Black Hat SEO Tools' started by Divvy, Aug 23, 2016.

  1. Divvy

    Divvy Junior Member

    Joined:
    Aug 24, 2011
    Messages:
    165
    Likes Received:
    10
    Hello guys,

    Does anyone here use Scrapebox who can give me a little help?
    I'm using 30 proxies and I'm trying to scrape using some keywords.
    But the scraping is soooo slowwwww...

    Please take a look:
    [​IMG]

    Average urls: 1
    What could it be? Any ideas?

    These are my settings:

    My Internet connection: 100mb/s

    Proxy Manager: 30 - Timeout: 10 sec
    Proxy Harvester: 50 - Timeout: 10 sec
    Url Harvester: 9 - Timeout: 10 sec

    Help! :)
     
  2. loopline

    loopline Jr. VIP Jr. VIP

    Joined:
    Jan 25, 2009
    Messages:
    3,876
    Likes Received:
    2,059
    Gender:
    Male
    You have 30 proxies and you're running 9 connections.

    So here is an oxymoron: turn the connections down if you want to go faster.

    Because at this point all your IPs are banned, which is why it's going so slow. You have over 100K errors with Google.

    I have a video that should help. You need to use the detailed harvester with a sizeable delay.
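Why fewer connections and a longer delay can help is easier to see with some rough arithmetic. The sketch below is not how Scrapebox schedules its requests. It assumes strict round-robin rotation over the proxy list and that every connection sends one query per delay period. The function name and the model itself are illustrative assumptions.

```python
def seconds_between_hits_per_proxy(proxies: int, connections: int, delay_s: float) -> float:
    """Rough gap, in seconds, between two queries sent through the same proxy.

    Assumed model (not taken from Scrapebox): proxies are used in strict
    round-robin order and each connection sends one query every delay_s seconds.
    Request time itself is ignored.
    """
    if proxies <= 0 or connections <= 0 or delay_s <= 0:
        raise ValueError("proxies, connections and delay_s must all be positive")
    # All connections together send `connections / delay_s` queries per second,
    # spread evenly over `proxies` IPs.
    return proxies * delay_s / connections


if __name__ == "__main__":
    for delay in (3, 10, 120):
        gap = seconds_between_hits_per_proxy(proxies=30, connections=1, delay_s=delay)
        print(f"delay {delay:>3}s -> each proxy queries about every {gap:.0f}s")
```

Under these assumptions, 30 proxies on 1 connection hit the engine once every 90 seconds per IP with a 3 second delay, and only once an hour per IP with a 120 second delay.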

     
  3. Divvy

    Divvy Junior Member

    Joined:
    Aug 24, 2011
    Messages:
    165
    Likes Received:
    10
    Hey loopline, thank you for your reply and help :)

    I changed my harvester connections from 9 to 1
    and set a 3 second delay in the detailed harvester.
    I'm using 30 proxies. Do you think that is a good setting, or should I change it?
     
  4. Divvy

    Divvy Junior Member

    Joined:
    Aug 24, 2011
    Messages:
    165
    Likes Received:
    10
    It doesn't work very well with these settings.

    Please help... how many connections should I set in the Harvester settings, and what delay in seconds?
    I'm using 30 proxies.

    Thank you guys!
     
  5. Divvy

    Divvy Junior Member

    Joined:
    Aug 24, 2011
    Messages:
    165
    Likes Received:
    10
    30 proxies
    1 harvester connection
    10 second delay
    It results in the proxies getting banned...

    Please advise: what delay should I try?
    Is the ban permanent or temporary?
     
  6. Kay

    Kay Newbie

    Joined:
    Aug 25, 2016
    Messages:
    24
    Likes Received:
    4
    Gender:
    Male
    mmmm - that is extremely slow. I don't think I have experienced that. Have you sent them a message/spoken to the company?
     
  7. loopline

    loopline Jr. VIP Jr. VIP

    Joined:
    Jan 25, 2009
    Messages:
    3,876
    Likes Received:
    2,059
    Gender:
    Male
    Bans are not permanent. But try a 120 second delay.

    I know of 1 person who had 100 proxies and had to use a 5 minute delay because of the query he was using. So once your proxies pass again, try 120. If it doesn't work, go up more on the delay. If it works fine, start shaving the delay down to the point where it gets blocked. Then go back up a little and you will know where the sweet spot is.

    No one can tell you exactly. It depends on the query type, how many times and how often the proxies have been blocked before, etc. There are too many variables to predict, but you can test it and find out.
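The trial-and-error procedure above can be sketched as a small search loop. This is only an illustration: `is_blocked` is a hypothetical callback standing in for "run a harvest at this delay and see whether the proxies get blocked", and the step sizes and the 600 second ceiling are made-up defaults, not Scrapebox settings.

```python
from typing import Callable, Optional


def find_delay_sweet_spot(
    is_blocked: Callable[[float], bool],
    start: float = 120.0,
    step_up: float = 60.0,
    shave: float = 10.0,
    max_delay: float = 600.0,
) -> Optional[float]:
    """Return the lowest delay (seconds) found to work, or None if none did.

    is_blocked(delay) is a hypothetical test run. In practice each blocked
    run means waiting for the proxies to pass again before the next attempt.
    """
    delay = start
    # Phase 1: if the starting delay gets blocked, go up until it works.
    while is_blocked(delay):
        delay += step_up
        if delay > max_delay:
            return None
    # Phase 2: shave the delay down until the next step would get blocked.
    while delay - shave > 0 and not is_blocked(delay - shave):
        delay -= shave
    # The last delay that worked sits just above the blocking point.
    return delay


if __name__ == "__main__":
    # Pretend anything under 45 seconds gets the proxies blocked.
    print(find_delay_sweet_spot(lambda d: d < 45))
```

Returning the last working delay is the "go back up a little" step: the loop stops one shave above the first delay that was blocked.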
     
  8. cashcorp

    cashcorp Jr. VIP Jr. VIP

    Joined:
    Feb 8, 2008
    Messages:
    477
    Likes Received:
    283
    You should look into mass-scraping public proxies and Google-testing them as well. We do almost all of our scraping using harvested, public proxies. Enough of those, and you can ramp up the connections significantly.

    Yeah, you're going to miss quite a bit due to proxies going down, timeouts, etc., but the increased volume more than makes up for it.
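For anyone scripting the "harvest a big list, then test it" step outside Scrapebox, a minimal sketch could look like the following. The `probe` callable is a hypothetical placeholder for whatever check you trust (for example, fetching one search results page through the proxy and looking for a block page). No real network call is made here, and the worker count is an arbitrary default.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def filter_working_proxies(
    proxies: Iterable[str],
    probe: Callable[[str], bool],
    max_workers: int = 50,
) -> List[str]:
    """Return the proxies for which probe(proxy) is True, in input order.

    probe is supplied by the caller (hypothetical here). Any exception it
    raises, such as a timeout, is treated as a failed proxy.
    """
    # Drop duplicate entries while keeping the original order.
    candidates = list(dict.fromkeys(p.strip() for p in proxies if p.strip()))

    def safe_probe(proxy: str) -> bool:
        try:
            return bool(probe(proxy))
        except Exception:
            return False

    # Public proxies are slow and flaky, so test many of them in parallel.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(safe_probe, candidates))
    return [proxy for proxy, ok in zip(candidates, results) if ok]


if __name__ == "__main__":
    fake_results = {"1.1.1.1:80": True, "2.2.2.2:8080": False}
    print(filter_working_proxies(fake_results, lambda p: fake_results[p]))
```

Because public proxies die quickly, a list filtered this way goes stale fast and would need re-testing before each run.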
     
  9. loopline

    loopline Jr. VIP Jr. VIP

    Joined:
    Jan 25, 2009
    Messages:
    3,876
    Likes Received:
    2,059
    Gender:
    Male
    Also, don't forget Google isn't the only engine. I use Google, but I get a lot of great results from Bing with way less work. Plus there are over 20 engines in Scrapebox by default.
     
  10. Divvy

    Divvy Junior Member

    Joined:
    Aug 24, 2011
    Messages:
    165
    Likes Received:
    10
    Thank you for your replies and help guys! :)

    cashcorp, what's your source? Can you share it? :)

    loopline, what delay do you suggest for Bing only with 30 proxies?
    I'm trying a 10 second delay but it is still very slow... do you think I would get good results with 3, for example?

    Thanks!!
     
  11. loopline

    loopline Jr. VIP Jr. VIP

    Joined:
    Jan 25, 2009
    Messages:
    3,876
    Likes Received:
    2,059
    Gender:
    Male
    Try it and see, it just varies. It may work fine; it depends on the history of the proxies, the query, etc. Worst case you get banned, wait 12-48 hours and try again.
     
  12. I2on4ld

    I2on4ld Junior Member

    Joined:
    May 28, 2016
    Messages:
    112
    Likes Received:
    17
    Gender:
    Male
    I just scrape with the detailed harvester, harvesting Yahoo and Bing only, and I still get millions of URLs.

    I'd also like to ask @loopline: I scrape grabbed comments from my previous AA list to expand my AA list. So I grab all comments from the previous AA list > merge those grabbed comments as keywords with quotes ("%kw%") > use the detailed harvester to scrape only a few platforms (ASP blog, Blog Engine and WordPress) and harvest on Yahoo and Bing only. After running for several hours, I only got good results from Bing: from 1 keyword I can mostly harvest up to page 15-20 on Bing, but I get 0 results on Yahoo.

    Question: does the detailed harvester no longer work on Yahoo with keywords merged with quotes? Or is some setting in my Scrapebox wrong?

    Thanks
     
  13. cashcorp

    cashcorp Jr. VIP Jr. VIP

    Joined:
    Feb 8, 2008
    Messages:
    477
    Likes Received:
    283
    @l2on4ld

    Dedupe your results. You either did not get quite that many unique results, or your footprints/filtering are terrible and you're going to pay for it later on ;)
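Scrapebox has its own duplicate removal, but if you export the harvested list and want to check how many unique results you really have, a rough sketch looks like this. The normalisation rules (ignore http/https, ignore a trailing slash, ignore a leading "www.") are my assumptions about what counts as "the same" URL, so adjust them to taste.

```python
from typing import Iterable, List
from urllib.parse import urlsplit


def dedupe_urls(urls: Iterable[str]) -> List[str]:
    """Keep the first occurrence of each URL, ignoring scheme and trailing slash."""
    seen = set()
    unique = []
    for raw in urls:
        url = raw.strip()
        if not url:
            continue
        parts = urlsplit(url)
        key = (parts.netloc.lower(), parts.path.rstrip("/"), parts.query)
        if key not in seen:
            seen.add(key)
            unique.append(url)
    return unique


def dedupe_domains(urls: Iterable[str]) -> List[str]:
    """Keep only the first URL seen for each host (a leading 'www.' is ignored)."""
    seen = set()
    unique = []
    for raw in urls:
        url = raw.strip()
        if not url:
            continue
        host = urlsplit(url).netloc.lower()
        if host.startswith("www."):
            host = host[4:]
        if host not in seen:
            seen.add(host)
            unique.append(url)
    return unique


if __name__ == "__main__":
    harvested = ["http://Example.com/a/", "https://example.com/a", "http://example.com/b"]
    print(len(harvested), "harvested,", len(dedupe_urls(harvested)), "unique")
```

Comparing the count before and after is a quick sanity check on a "millions of URLs" harvest.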

    As to your question, are you running the latest updated version of Scrapebox? I've noticed Yahoo tends to stop working every few weeks, until I grab an update. For whatever reason I think they might block scraping more actively than Bing (Even though they have literally identical results.)
     
  14. loopline

    loopline Jr. VIP Jr. VIP

    Joined:
    Jan 25, 2009
    Messages:
    3,876
    Likes Received:
    2,059
    Gender:
    Male
    I think you are being overly specific here. I didn't sleep much last night, so I'm not sure if that makes sense.

    You scraped comments and merged them with keywords and then also with footprints, so you're limiting your own results.

    Scrape comments, then scrape engines.

    Scrape comments, then merge footprints and scrape engines.

    Scrape comments, then merge keywords and scrape engines.

    But not comments merged with both keywords and footprints before scraping engines.

    Some comments will have a LOT of results, others won't have that many, so by the time you add keywords or footprints you are already narrowing results, but tack on both of those and it's a bit much.

    You can get a loose idea with the Google competition finder, which has nothing to do with Yahoo or Bing, but it could work, or just take 15 of those combos and look at them in a browser with Yahoo and Bing. I always find it invaluable to do things manually in a browser; I learn so much.

    I think it was Bill Gates (Microsoft, you know) who said scaling an inefficient system only scales the inefficiency. So you have to dial it in before you go automating/scaling it with Scrapebox. Else you're just pushing buttons hoping the money tree will dump money on you, which is not likely to happen.
     
  15. I2on4ld

    I2on4ld Junior Member

    Joined:
    May 28, 2016
    Messages:
    112
    Likes Received:
    17
    Gender:
    Male
    I'm not merging comments with keywords, I'm merging comments with quotes ("%kw%") and footprints.

    So if the comment is:
    this is very nice blog

    I merge it with "%kw%" so it becomes:
    "this is very nice blog"

    And I use that as the keyword for harvesting,
    and choose only 4-5 platforms to harvest.

    So for example, SB will harvest:
    "Speak your mind"+"this is very nice blog"
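To make the merge step concrete, here is a small sketch of how a comment, the "%kw%" template and a footprint could combine into a query like the one above. It only imitates the result shown in this post. Joining with "+" and the function name are assumptions, not a description of how Scrapebox builds queries internally.

```python
from typing import Iterable, List


def merge_keywords(
    comments: Iterable[str],
    footprints: Iterable[str],
    template: str = '"%kw%"',
) -> List[str]:
    """Build one query per footprint/comment pair.

    Each comment replaces the %kw% token in the template (wrapping it in
    quotes by default), and the footprint is joined on with '+', mirroring
    the example query in this post.
    """
    cleaned = [c.strip() for c in comments if c.strip()]
    queries = []
    for footprint in footprints:
        for comment in cleaned:
            quoted = template.replace("%kw%", comment)
            queries.append(footprint + "+" + quoted)
    return queries


if __name__ == "__main__":
    for query in merge_keywords(["this is very nice blog"], ['"Speak your mind"']):
        print(query)
```

Every extra quoted phrase makes the query an exact-match search, so the number of possible results shrinks quickly as more pieces are merged in.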

    On Bing I get many results from that; sometimes it harvests 15-20 pages.

    But on Yahoo the result is only 1 or sometimes 0.

    I use the latest version of SB; I always upgrade to the latest version when it is available to download.

    Thanks
     
  16. redarrow

    redarrow Elite Member

    Joined:
    Apr 1, 2013
    Messages:
    5,977
    Likes Received:
    1,445
    Divvy, are you using a VPS or scraping at home?

    If at home, speed is subject to your home connection speed and the proxies' speed.

    What is the speed of your proxies?

    Are your proxies private or shared?