Scrapebox Harvester not working?

Discussion in 'Black Hat SEO Tools' started by Rogar, Feb 10, 2012.

  1. Rogar

    Rogar Newbie

    Joined:
    Nov 6, 2011
    Messages:
    49
    Likes Received:
    4
    Hi,

    I bought Scrapebox the other day (so it's probably something to do with me, not the tool) and I've been trying every way to use it to find forums (I'm trying to learn more about the custom footprint capabilities).

    However, I find fresh public proxies, all working fine.
    I go to harvester part
    click "Custom Footprint"
    I enter "powered by phpbb"
    I click "Start Harvester"

    And AOL always comes back with a 403 error. Google says completed and either finds 100, 200 or 0 and Yahoo (on this occasion) found 400 then reports an error 999.

    I've been trying different custom footprints for hours with no joy. I've read forums, watched videos etc. and I can't see what I am doing wrong. I've disabled AV (but that actually made SB find less lol). I've restarted the PC and run CCleaner.

    Your help would be GREATLY appreciated as it is something I'm probably doing wrong.

    Rogar.
     
  2. eatingMemory

    eatingMemory Jr. VIP Jr. VIP

    Joined:
    Mar 23, 2010
    Messages:
    1,302
    Likes Received:
    189
    The proxies you found on forums are probably banned by the search engines. They may still be alive, which is why they test fine in Scrapebox, but the search engines refuse them. Try buying a few private proxies and you can scrape continuously =)
     
  3. Rogar

    Rogar Newbie

    Joined:
    Nov 6, 2011
    Messages:
    49
    Likes Received:
    4
    Ahh Ok, I am planning on buying some proxies tonight however, I've been hearing scrape with public, post with private so was trying to scrape but that wasn't working lol.

    Ok, buying in about 20 minutes and will report back for others who might be in a similar boat.

    Thanks for reply.
     
  4. endgeek

    endgeek Regular Member

    Joined:
    Jan 7, 2012
    Messages:
    245
    Likes Received:
    117
    Occupation:
    Web- & Graphicdesign
    Location:
    Germany
    Google 0, Yahoo 0, AOL 60 Results..

    If I put this footprint in Google and hit enter, it needs at least 5 seconds to get any results..

    Maybe a new filter method? o.0
     
  5. Rogar

    Rogar Newbie

    Joined:
    Nov 6, 2011
    Messages:
    49
    Likes Received:
    4

    Bought my bloomin private proxies now lol.
     
  6. endgeek

    endgeek Regular Member

    Joined:
    Jan 7, 2012
    Messages:
    245
    Likes Received:
    117
    Occupation:
    Web- & Graphicdesign
    Location:
    Germany
    I think if it's really a filter by Google, a fix will come within the next few days..
     
  7. Rogar

    Rogar Newbie

    Joined:
    Nov 6, 2011
    Messages:
    49
    Likes Received:
    4
    Do you know how I can report it to the scrape box team? Like I say I've only had the s/w about a day or so and am not aware of correct procedures.

    Thanks,
    Rogar
     
  8. endgeek

    endgeek Regular Member

    Joined:
    Jan 7, 2012
    Messages:
    245
    Likes Received:
    117
    Occupation:
    Web- & Graphicdesign
    Location:
    Germany
  9. Scritty

    Scritty Elite Member Premium Member

    Joined:
    May 1, 2010
    Messages:
    2,807
    Likes Received:
    4,496
    Occupation:
    Affiliate Marketer
    Location:
    UK
    Add some detail here chaps.

    I've been scraping all day with custom footprints without any issue.

    However - I will say you can burn out private proxies real quick with scraping. I use 4 proxies per thread for scraping (so in my case 100 private proxies - 25 threads). If you've got 10 or 12 proxies, I'd keep the thread number for scraping REAL low (3?) and set it so it swaps proxies after every scrape.
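
    As a rough sketch of the rule of thumb above (about 4 proxies per scraping thread), the thread count can be worked out like this. The function name and the floor of one thread are illustrative assumptions, not a Scrapebox setting:

```python
def max_scrape_threads(proxy_count: int, proxies_per_thread: int = 4) -> int:
    """Suggest a thread count so each thread has several proxies to rotate through."""
    if proxy_count <= 0 or proxies_per_thread <= 0:
        raise ValueError("proxy_count and proxies_per_thread must be positive")
    # Integer division; never suggest fewer than one thread (assumption).
    return max(1, proxy_count // proxies_per_thread)


print(max_scrape_threads(100))  # 25
print(max_scrape_threads(12))   # 3
```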

    Even with three proxies it's possible to get 50+ URLs a second, so it's not as big an issue as it sounds. A 25-minute scrape can give between 60,000 and 75,000 URLs, and that's plenty to start with.
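
    The figures above are consistent with simple arithmetic: a steady rate multiplied by the scrape duration. A minimal sketch, assuming a constant rate for the whole run (real scrapes will fluctuate); the function name is hypothetical:

```python
def estimated_urls(urls_per_second: float, minutes: float) -> int:
    """Estimate total URLs harvested at a constant rate over a scrape of the given length."""
    return int(urls_per_second * minutes * 60)


print(estimated_urls(50, 25))  # 75000
print(estimated_urls(40, 25))  # 60000
```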

    Only those IMers that want to scrape literally hundreds of thousands of sites a day need worry too much about having gazillions of proxies. Personally, getting 20,000 fast poster and say 2000 slow on a batch of 10 or 12 of my WEB2 sites is ample. I might repeat a couple of times a day, but it's no big deal.

    Scritty
     
    • Thanks x 1
  10. Rogar

    Rogar Newbie

    Joined:
    Nov 6, 2011
    Messages:
    49
    Likes Received:
    4
    Hi Scritty,

    Could you try running the custom footprint I used to see if you get better results? ("powered by phpbb")

    Rogar.

     
  11. cdenet

    cdenet Registered Member

    Joined:
    Apr 25, 2009
    Messages:
    80
    Likes Received:
    41
    Occupation:
    SEO
    Location:
    Detroit, USA
    If you use public proxies:

    Go to Settings > Change Maximum Connections and set it to 3. In the same settings box, set "Change Private Proxy Every" to 1 post.

    Go to Settings > Adjust Harvester Timeout to between 30 and 60 seconds (faster just means harvesting will end faster if it hangs on a website -- like Google)

    Then Settings > Adjust Harvester Proxy Retries and set between 1 and 4

    Play with those settings and you'll see things change (hopefully better). I usually get 30 to 50 public proxies and my scraping can vary from 500 to 1000 results per harvest.
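
    For anyone keeping notes on what they have tried, the values suggested above could be written down and sanity-checked like this. This is only a sketch: the class and field names are made up for illustration and are not a Scrapebox file format or API, and the default values are just picks from within the suggested ranges.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HarvesterSettings:
    """Hypothetical record of the settings suggested in this post."""
    max_connections: int = 3     # suggested value: 3
    change_proxy_every: int = 1  # swap proxy after every post
    timeout_seconds: int = 30    # suggested range: 30 to 60
    proxy_retries: int = 2       # suggested range: 1 to 4

    def problems(self) -> List[str]:
        """List the values that fall outside the suggested ranges."""
        issues = []
        if self.max_connections > 3:
            issues.append("max_connections above 3")
        if self.change_proxy_every != 1:
            issues.append("change_proxy_every is not 1")
        if not 30 <= self.timeout_seconds <= 60:
            issues.append("timeout_seconds outside 30-60")
        if not 1 <= self.proxy_retries <= 4:
            issues.append("proxy_retries outside 1-4")
        return issues


print(HarvesterSettings().problems())                    # []
print(HarvesterSettings(timeout_seconds=10).problems())  # ['timeout_seconds outside 30-60']
```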

    Edit: One more thing ... Check the "Time" box. If you have it set to 24 hours, a month or anything other than "Anytime", then you're doing an Advanced search in Google and filtering by time, which will give different results than you'd normally see when searching.
     
    • Thanks x 1
  12. mnhweb

    mnhweb Regular Member

    Joined:
    Apr 29, 2009
    Messages:
    383
    Likes Received:
    78
    When I add allinurl:".edu/mediawiki/index.php" to my Scrapebox and choose Footprint, I'm only getting 17 URLs after the run! But when I use the same query on google.com I get 1,180,000 results!

    I have added 50 private proxies; all are tested and working!
    I use the settings cdenet described above!

    Can someone tell me what I'm doing wrong?

    PS: it's my first time using Scrapebox. I have watched videos for hours now and don't understand why it's not working!
     
  13. soooted

    soooted Registered Member

    Joined:
    Jun 21, 2009
    Messages:
    53
    Likes Received:
    14
    Occupation:
    ethical hacker
    Make sure you update to 1.15.37. Google and Yahoo made some changes. Gotta love the support for Scrapebox for a one-time fee.
     
  14. mnhweb

    mnhweb Regular Member

    Joined:
    Apr 29, 2009
    Messages:
    383
    Likes Received:
    78
    I already use version: 1.15.37
     
  15. eltel78

    eltel78 Newbie

    Joined:
    Aug 13, 2008
    Messages:
    13
    Likes Received:
    1
    Having the same problem here too. I scraped some good proxies with the new proxy harvester, cleared those with latency over 2000, Google passed, IP passed, etc. When I come to scrape... nothing happens with Google. I even followed the above settings. How is this possible? Does Google keep changing things? I am on the latest v1.15.37
     
  16. cody41

    cody41 Power Member

    Joined:
    Jun 18, 2009
    Messages:
    682
    Likes Received:
    274
    Location:
    Texas
    Are they google enabled proxies?
     
  17. mnhweb

    mnhweb Regular Member

    Joined:
    Apr 29, 2009
    Messages:
    383
    Likes Received:
    78
    Yes, all my proxies are Google enabled
     
  18. jb2008

    jb2008 Senior Member

    Joined:
    Jul 15, 2010
    Messages:
    1,158
    Likes Received:
    972
    Occupation:
    Scraping, Harvesting in the Corn Fields
    Location:
    On my VPS servers
    FFS guys use hrefer. I use 3000 working public proxies and scrape on 1000 threads. God knows how many URLs I get per second but I can rip through hundreds of millions, even billions of URLs per scrape.

    Scrapebox is for beginner small-scale scrapes, to get used to scraping, footprints etc. If you want any real power/volume, you have to graduate to hrefer.
     
  19. Fripper

    Fripper Jr. VIP Jr. VIP Premium Member

    Joined:
    Jan 8, 2012
    Messages:
    411
    Likes Received:
    142
    Usually it is the proxies that stop you. If I don't get many results in the first minute or so, I check the proxies and take away the bad ones. When I try again it usually works great. Remember to use at least 100+ public proxies.
     
  20. Petedurango

    Petedurango Newbie

    Joined:
    Dec 1, 2011
    Messages:
    5
    Likes Received:
    0
    Google has different rules for different searches. They will allow proxies for standard searches that have no footprints using the url or site commands; however, they will give you a 302 error (IP ban) if you perform more involved searches using url or site commands, even with a proxy that previously passed. I am using Bing, Yahoo, and AOL as well, and find that Bing has the best results. Check out the looplinescraper video (YouTube) titled "Google IP Bans - Different kinds of bans".
    Cheers.
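
    To pull together the error codes mentioned in this thread (AOL 403, Yahoo 999, Google 302), here is a small lookup sketch. The interpretations are the ones posters gave or implied here, not documented behaviour of any search engine, and the function and table names are made up for illustration:

```python
# Status codes reported in this thread, with the meaning posters attributed to them.
BLOCK_SIGNALS = {
    ("aol", 403): "forbidden; proxy is probably banned",
    ("yahoo", 999): "error 999 as reported in the first post; proxy is probably blocked",
    ("google", 302): "redirect that posters here read as an IP ban",
}


def describe_response(engine: str, status: int) -> str:
    """Map an (engine, HTTP status) pair to the explanation given in the thread."""
    key = (engine.lower(), status)
    if key in BLOCK_SIGNALS:
        return BLOCK_SIGNALS[key]
    if status == 200:
        return "ok"
    return "unknown"


print(describe_response("Google", 302))
print(describe_response("AOL", 200))
```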