
Can't harvest URLs with Scrapebox

Discussion in 'Black Hat SEO' started by IMpossible, Aug 19, 2013.

  1. IMpossible

    IMpossible Supreme Member

    Joined:
    Apr 15, 2012
    Messages:
    1,338
    Likes Received:
    302
    Occupation:
    Internet Marketing Guru
    Location:
    Somewhere on earth
    Hi guys,

    I'm having an issue with SB. When I harvest URLs with it set to "Custom Footprint" but don't enter a footprint, I get URLs for my keyword. However, when I enter something, or when I choose "Wordpress", I get 0 results. Can someone help me out?
     
  2. IMpossible

    IMpossible Supreme Member

    Joined:
    Apr 15, 2012
    Messages:
    1,338
    Likes Received:
    302
    Occupation:
    Internet Marketing Guru
    Location:
    Somewhere on earth
    Anyone?
     
  3. kkvsam

    kkvsam Senior Member

    Joined:
    Oct 11, 2009
    Messages:
    936
    Likes Received:
    569
    Occupation:
    SYS ADMIN
    Home Page:
    Maybe it's your proxies? Currently I don't have any issue.
     
  4. IMpossible

    IMpossible Supreme Member

    Joined:
    Apr 15, 2012
    Messages:
    1,338
    Likes Received:
    302
    Occupation:
    Internet Marketing Guru
    Location:
    Somewhere on earth
    I'm using 20 Google-passed public proxies... still the same issue. What can I do?
     
  5. cryptons

    cryptons Jr. VIP Jr. VIP Premium Member

    Joined:
    Jun 12, 2013
    Messages:
    1,218
    Likes Received:
    447
    Occupation:
    Developer,social marketer
    Use private proxies or restart your PC!
     
  6. IMpossible

    IMpossible Supreme Member

    Joined:
    Apr 15, 2012
    Messages:
    1,338
    Likes Received:
    302
    Occupation:
    Internet Marketing Guru
    Location:
    Somewhere on earth
    What's the point in restarting my PC? I don't think it has anything to do with that...
     
  7. Rua999

    Rua999 Power Member

    Joined:
    Jun 25, 2011
    Messages:
    630
    Likes Received:
    407
    Maybe you're using advanced operators like inurl: etc. and Google has temporarily blocked your proxies from scraping with them (your proxies still show as Google passed when this happens).

    When I'm scraping with these types of footprints, I generally set it up so that if I have 20 private proxies I'll scrape at 2 threads at a time... 30 proxies, 3 threads at a time... 40 proxies, 4 threads, and so on.

    Your proxies might be temporarily burned; it happened to me just a couple of days ago as well. I just bought more proxies, added them to the mix, and everything's rosy again.
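
    A minimal sketch of that proxy-to-thread ratio in Python (the function name threads_for and the 1-thread-per-10-proxies default are only assumptions pulled from the numbers in this post, not a ScrapeBox setting):

    [code]
    # Rough rule of thumb from the post above: about 1 harvester thread
    # per 10 private proxies when scraping with advanced-operator footprints.
    def threads_for(proxy_count, proxies_per_thread=10):
        """Return a conservative thread count for a given number of proxies."""
        return max(1, proxy_count // proxies_per_thread)

    for n in (20, 30, 40):
        print(f"{n} proxies -> {threads_for(n)} thread(s)")
    # 20 proxies -> 2 thread(s)
    # 30 proxies -> 3 thread(s)
    # 40 proxies -> 4 thread(s)
    [/code]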
     
    • Thanks Thanks x 1
  8. RedMango

    RedMango Power Member

    Joined:
    Jul 15, 2010
    Messages:
    518
    Likes Received:
    201
    Location:
    UK
    I'm getting this same error. I have private proxies and have tried various footprints. I've been harvesting hundreds of thousands of URLs over the past few weeks. My proxies get refreshed every month; the new ones aren't working, and last month's didn't work either. Can't figure it out.
     
  9. IMpossible

    IMpossible Supreme Member

    Joined:
    Apr 15, 2012
    Messages:
    1,338
    Likes Received:
    302
    Occupation:
    Internet Marketing Guru
    Location:
    Somewhere on earth
    The WordPress footprint doesn't even work with private proxies? Man... that's a serious issue...
     
  10. dinkish

    dinkish Power Member

    Joined:
    Apr 19, 2013
    Messages:
    689
    Likes Received:
    159
    The only time I had an issue, I got a 503 error code or something similar, visible at the bottom right of the window.
     
  11. IMpossible

    IMpossible Supreme Member

    Joined:
    Apr 15, 2012
    Messages:
    1,338
    Likes Received:
    302
    Occupation:
    Internet Marketing Guru
    Location:
    Somewhere on earth
    Do you use private proxies when harvesting?
     
  12. RedMango

    RedMango Power Member

    Joined:
    Jul 15, 2010
    Messages:
    518
    Likes Received:
    201
    Location:
    UK
    I think my proxies are failing or my IP is banned or something. Need to speak to my ISP... I've just noticed my IP is blacklisted on a few checkers...
     
  13. loopline

    loopline Jr. VIP Jr. VIP

    Joined:
    Jan 25, 2009
    Messages:
    3,368
    Likes Received:
    1,797
    Gender:
    Male
    Home Page:
    Go to the settings menu and untick the following:

    use custom harvester
    use multi threaded harvester

    Then try to harvest the queries you're having problems with. What error codes do you get in the status column? What engines are you harvesting from?

    The status codes combined with the engines, if you can post them here, will answer your question and let me point you in the right direction.

    Also, I have a video that covers this; it's here:

    [video=youtube_share;2QaLWgTXsRo]http://youtu.be/2QaLWgTXsRo[/video]
     
    • Thanks Thanks x 2
  14. testdrive

    testdrive Regular Member

    Joined:
    Sep 18, 2008
    Messages:
    443
    Likes Received:
    173
    Some search operators like inurl:, intitle:, etc. can get your proxy blocked very fast for the operator you're using. As Rua999 said, your proxy can still pass the Google check, but it will be blocked for scraping if you use those operators.
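
    One rough way to check for this kind of operator-only block is to send the same keyword through a proxy with and without an advanced operator and compare the responses. A minimal sketch using Python's requests library; the proxy address, the example queries, and the assumption that a blocked request comes back as HTTP 429/503 or a redirect to Google's /sorry/ page are illustrative, not details from this thread:

    [code]
    import requests

    PROXY = "http://user:pass@1.2.3.4:8080"   # hypothetical proxy
    HEADERS = {"User-Agent": "Mozilla/5.0"}   # plain browser-style user agent

    def google_status(query):
        """Fetch a Google SERP for `query` through the proxy and flag likely blocks."""
        r = requests.get(
            "https://www.google.com/search",
            params={"q": query},
            headers=HEADERS,
            proxies={"http": PROXY, "https": PROXY},
            timeout=15,
            allow_redirects=False,
        )
        blocked = r.status_code in (429, 503) or "/sorry/" in r.headers.get("Location", "")
        return r.status_code, blocked

    # Same keyword, with and without an advanced operator.
    print(google_status('"powered by wordpress" gardening'))
    print(google_status('inurl:blog "powered by wordpress" gardening'))
    # If the plain query passes but the inurl: query is flagged as blocked,
    # the proxy is only burned for advanced operators.
    [/code]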
     
  15. IMpossible

    IMpossible Supreme Member

    Joined:
    Apr 15, 2012
    Messages:
    1,338
    Likes Received:
    302
    Occupation:
    Internet Marketing Guru
    Location:
    Somewhere on earth
    Wow loopline, didn't know you were active on BHW. Just watched all of your videos. :)
     
    • Thanks Thanks x 1
  16. seocrab

    seocrab Senior Member

    Joined:
    May 2, 2013
    Messages:
    969
    Likes Received:
    825
    Occupation:
    seo freelancer
    Location:
    UK
    Thanks for that, I was having the same problem.
     
  17. mindlesswizard

    mindlesswizard Supreme Member

    Joined:
    Sep 3, 2010
    Messages:
    1,359
    Likes Received:
    282
    Occupation:
    Designer/Developer, Internet Marketer
    Location:
    in the shade of Everest
    Well, I'm having the same problem with harvesting. This is what I missed. Thanks for the info :) gotta check it.
     
  18. dannyhw

    dannyhw Senior Member

    Joined:
    Jul 16, 2008
    Messages:
    980
    Likes Received:
    462
    Occupation:
    Software Engineer
    Location:
    New York City Burbs
    It's been a while since I've done anything like this, but if I'm remembering correctly, there were situations where I got hit with a captcha really fast with certain footprints, yet if I did a search without one, it wouldn't hit me until I tried a footprint again.

    Use your proxies in a browser and try a G search with a footprint?

    It's been even longer since I used SB, but wasn't there something in there that would queue up your captcha'd proxies and not use them until you solved it? Or was that something else? The only off-the-shelf software I ever actually liked is SB, so I think it was that.
     
  19. loopline

    loopline Jr. VIP Jr. VIP

    Joined:
    Jan 25, 2009
    Messages:
    3,368
    Likes Received:
    1,797
    Gender:
    Male
    Home Page:
    It was ScrapeBox, but for various reasons ScrapeBox will no longer enter captchas to unblock proxies. As you noted, there are different ban levels anyway; Google will block for one type of query but not another.

    The best approach I have found is to just use private proxies and set your connections appropriately; I harvest non-stop for days and days and my proxies never get blocked. It's about finding the sweet spot of how fast you can go without actually getting blocked. Usually that means 1 connection for every 10 proxies for basic keywords, and a ratio of 1:15 or more for advanced operators.

    So if you had 30 proxies and you want to use advanced operators, set connections to 2, and so on...
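
    That math is easy to sanity-check; a minimal sketch in Python (the 1:10 and 1:15 ratios come straight from the post above, while the function name and the rounding-down choice are just assumptions):

    [code]
    # 1 connection per 10 proxies for basic keywords,
    # 1 connection per 15 (or more) proxies for advanced operators.
    def connections_for(proxy_count, advanced_operators=False):
        ratio = 15 if advanced_operators else 10
        return max(1, proxy_count // ratio)

    print(connections_for(30))                           # basic keywords   -> 3
    print(connections_for(30, advanced_operators=True))  # advanced queries -> 2
    [/code]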