
Google SERP scrape: Ban

Discussion in 'Black Hat SEO' started by thelastnewbie, Jun 22, 2010.

  1. thelastnewbie

    thelastnewbie Newbie

    Joined:
    Dec 6, 2008
    Messages:
    48
    Likes Received:
    0
    When I scrape Google SERPs I get banned after a few requests, even though I changed the proxy IP. Is there some other kind of ban, like a user-agent or cookie ban?
     
  2. trophaeum

    trophaeum Senior Member

    Joined:
    Dec 21, 2007
    Messages:
    1,189
    Likes Received:
    706
    Google uses heuristics to detect likely scrapers. Pick a simpler/different footprint and have lots of proxies ready lol
     
  3. G0D0VERY0U

    G0D0VERY0U Regular Member

    Joined:
    Apr 6, 2010
    Messages:
    439
    Likes Received:
    282
    Also, look for a google_com.xml file on your hard disk. It's normally hidden, so be logged in as the administrator when looking. I don't get this file with SB or Hrefer, but I don't know what app you are using to scrape with, so it may apply in your case. Delete the file, change your proxy, and make sure the proxy is fully anonymous.
     
  4. nikoman

    nikoman Regular Member

    Joined:
    May 15, 2010
    Messages:
    289
    Likes Received:
    65
    Location:
    Twilight Zone
    Clear cookies every time you get banned.
     
  5. thelastnewbie

    thelastnewbie Newbie

    Joined:
    Dec 6, 2008
    Messages:
    48
    Likes Received:
    0
    Is there a simple method to clear cookies in Java?
    In which directory is the XML file on a Windows Vista system when I use
    Java HTTP requests?
     
  6. rocket

    rocket Regular Member

    Joined:
    Apr 14, 2009
    Messages:
    471
    Likes Received:
    131
    Occupation:
    Web developer and marketer
    Location:
    In my competitor's mind
    Can you turn java off and then scrape? Never tried.
     
  7. botdevs

    botdevs Newbie

    Joined:
    May 15, 2010
    Messages:
    18
    Likes Received:
    8
    Occupation:
    Developer
    Location:
    California
    What nikoman said, clear cookies before you change proxies. It's not anything else.
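For the Java question, a minimal sketch of clearing cookies, assuming the scraper uses the JDK's built-in HttpURLConnection with a java.net.CookieManager installed as the default handler. The class name CookieReset and its helper methods are made up for illustration. Note that if no CookieHandler is installed, HttpURLConnection does not store cookies at all, and other HTTP libraries keep their own cookie stores that this would not touch.

```java
import java.net.CookieHandler;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.HttpCookie;
import java.net.URI;

// Hypothetical helper: one in-memory cookie jar for the whole JVM.
class CookieReset {
    // Passing null as the store makes CookieManager use its default in-memory store.
    static final CookieManager MANAGER = new CookieManager(null, CookiePolicy.ACCEPT_ALL);

    // Install the manager so HttpURLConnection reads and writes cookies through it.
    static void install() {
        CookieHandler.setDefault(MANAGER);
    }

    // Number of cookies currently held in the jar.
    static int cookieCount() {
        return MANAGER.getCookieStore().getCookies().size();
    }

    // Drop every stored cookie, e.g. right before switching to another proxy.
    static void clearCookies() {
        MANAGER.getCookieStore().removeAll();
    }

    public static void main(String[] args) {
        install();
        // Simulate a cookie that a previous response would have set.
        MANAGER.getCookieStore().add(URI.create("http://example.com/"),
                new HttpCookie("session", "abc"));
        System.out.println("before: " + cookieCount());
        clearCookies();
        System.out.println("after: " + cookieCount());
    }
}
```

Since this store lives only in memory, there is no cookie file on disk to look for in this setup; the jar is empty again after clearCookies() or after the program exits.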
     
  8. thelastnewbie

    thelastnewbie Newbie

    Joined:
    Dec 6, 2008
    Messages:
    48
    Likes Received:
    0
    I scrape with my own Java software, so I cannot turn Java off.
    I cannot override the proxies, are there any java classes available to clear
    all of them?
     
  9. rocket

    rocket Regular Member

    Joined:
    Apr 14, 2009
    Messages:
    471
    Likes Received:
    131
    Occupation:
    Web developer and marketer
    Location:
    In my competitor's mind
    You might want to try a VPN. I use mine for scraping and haven't had a problem yet. Make sure it's a good VPN though. I may just be lucky, who knows.
     
  10. thelastnewbie

    thelastnewbie Newbie

    Joined:
    Dec 6, 2008
    Messages:
    48
    Likes Received:
    0
    What do you mean by VPN?

    I cannot find this xml file.
     
    Last edited: Jun 22, 2010
  11. Yukinari84

    Yukinari84 Elite Member

    Joined:
    Dec 12, 2007
    Messages:
    2,474
    Likes Received:
    4,665
    Occupation:
    I'm retired ;p
    Location:
    Somewhere in space...
    You need lots of proxies, and you need to clear your cookies every time you burn out an IP.
     
  12. rocket

    rocket Regular Member

    Joined:
    Apr 14, 2009
    Messages:
    471
    Likes Received:
    131
    Occupation:
    Web developer and marketer
    Location:
    In my competitor's mind
    Virtual Private Network. I'm not sure if that would work any better than proxies, because it's basically like a proxy but is a virtual environment. It's worked for me, but I don't do really heavy scraping. Just google "virtual private network".
     
  13. Yukinari84

    Yukinari84 Elite Member

    Joined:
    Dec 12, 2007
    Messages:
    2,474
    Likes Received:
    4,665
    Occupation:
    I'm retired ;p
    Location:
    Somewhere in space...
    I use HotSpot Shield for scraping Google and I have to say it does seem to outperform a lot of proxies. At least free proxies.

    Plus it's so damn easy to change your IP with it.