
Scraping Google

Discussion in 'Black Hat SEO Tools' started by iByte, Feb 3, 2012.

  1. iByte

    iByte Newbie

    Joined:
    Jan 24, 2012
    Messages:
    4
    Likes Received:
    1
    I am looking to scrape Google and was looking to see which proxy provider would make sense to use. Also does using a VPN instead get better results (in terms of Google blocking the scrape)?
     
  2. Scritty

    Scritty Elite Member Premium Member

    Joined:
    May 1, 2010
    Messages:
    2,807
    Likes Received:
    4,496
    Occupation:
    Affiliate Marketer
    Location:
    UK
    A few providers here: Pro proxy, Squid and others. All will start you out with proxies able to scrape Google and provide proper ID cloaking.

    However, Google is fecking touchy these days. More than about 20 requests for info in as many minutes, and they will slap a temporary ban on that IP.

    My wife has even had a 90-minute ban on our home internet IP just for looking for some stuff to buy for my eldest's birthday.

    It used to be you'd get a few hundred scrapes before G got twitchy - now it's a few dozen.

    If the same job can be done in Yahoo (and most jobs can) I would use them instead.
    I'm not saying DON'T scrape G, just be fecking careful these days. Scrapebox on 50+ threads can burn out 25 full private proxies in 5 minutes.

    I do scrapes in Scrapebox now with 1 thread per 5 proxies and go into settings and set it to rotate every request - it STILL gets funny if I have a long list of keywords.
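For anyone rolling their own instead of using Scrapebox, the "rotate every request" idea combined with a per-IP pause could look something like the sketch below. This is only a rough illustration: `ProxyRotator` is a made-up name, the proxy addresses are placeholders, and the 60-second gap is just one reading of "about 20 requests in as many minutes", not a documented limit.

```python
import itertools
import time


class ProxyRotator:
    """Round-robin over proxies, leaving at least `min_gap` seconds between
    two uses of the same proxy. Hypothetical helper, not a Scrapebox feature."""

    def __init__(self, proxies, min_gap=60.0, clock=time.monotonic, sleep=time.sleep):
        if not proxies:
            raise ValueError("need at least one proxy")
        self._cycle = itertools.cycle(proxies)
        self._last_used = {p: None for p in proxies}
        self._min_gap = min_gap  # assumed safe gap per IP, in seconds
        self._clock = clock
        self._sleep = sleep

    def next_proxy(self):
        """Return the next proxy in the rotation, sleeping first if it was
        used less than `min_gap` seconds ago."""
        proxy = next(self._cycle)
        last = self._last_used[proxy]
        if last is not None:
            wait = self._min_gap - (self._clock() - last)
            if wait > 0:
                self._sleep(wait)
        self._last_used[proxy] = self._clock()
        return proxy


if __name__ == "__main__":
    # Placeholder addresses; a tiny gap so the demo finishes quickly.
    rotator = ProxyRotator(["10.0.0.1:8080", "10.0.0.2:8080"], min_gap=0.1)
    for keyword in ["blue widgets", "red widgets", "green widgets"]:
        print(keyword, "->", rotator.next_proxy())
```

With 5 proxies and a 60-second gap this works out to roughly 5 requests a minute in total, so a long keyword list will be slow by design.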

    12 months ago it was all so darn easy.

    Article sites took articles.
    Pron forums actually had pron on them
    G let you scrape a lot more
    Twitter let you hammer automated tools a lot more
    CL let you post a lot more

    The upside is - most of the lazy idiots are getting out of IM - and there is roughly the same money to go round that there always was - so more for those prepared to work.

    Waffling now.

    /Scritty Out
     
  3. iByte

    iByte Newbie

    Joined:
    Jan 24, 2012
    Messages:
    4
    Likes Received:
    1
    Thanks for the dope, Scritty.

    I have used squidproxy and ran into these issues, so I am looking to improve my strategy. I can expand to a large number of proxies, but they need to be reliable. I get the captcha screen on a random 20-odd IPs out of my set of 100. I could replace them, but the result is the same after every attempt.

    I don't know if changing proxy providers would solve this.
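One way to handle the captcha'd IPs without swapping them out would be to bench each flagged proxy for a while and retry it later, which also shows whether the block is temporary. A rough sketch only: `ProxyPool`, the 90-minute cooldown (borrowed from the ban length Scritty mentioned) and the proxy names are assumptions for illustration, and detecting the captcha page itself is left out.

```python
import time


class ProxyPool:
    """Tracks which proxies are benched after a captcha. Hypothetical helper."""

    def __init__(self, proxies, cooldown=90 * 60, clock=time.monotonic):
        # Every proxy starts out available.
        self._benched_until = {p: float("-inf") for p in proxies}
        self._strikes = {p: 0 for p in proxies}
        self._cooldown = cooldown  # assumed cooldown in seconds
        self._clock = clock

    def available(self):
        """Proxies whose cooldown has expired, in their original order."""
        now = self._clock()
        return [p for p, until in self._benched_until.items() if until <= now]

    def report_captcha(self, proxy):
        """Bench a proxy that just hit a captcha and count the strike."""
        self._strikes[proxy] += 1
        self._benched_until[proxy] = self._clock() + self._cooldown

    def report_ok(self, proxy):
        """A clean response resets the consecutive strike count."""
        self._strikes[proxy] = 0

    def strikes(self, proxy):
        """Consecutive captchas seen on this proxy."""
        return self._strikes[proxy]


if __name__ == "__main__":
    pool = ProxyPool(["proxy-1", "proxy-2", "proxy-3"])
    pool.report_captcha("proxy-2")
    print(pool.available())
```

A proxy whose strike count keeps climbing across several cooldowns is probably blocked for longer than the cooldown, which would be the point to drop it rather than keep retrying.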

     
  4. ~divinci

    ~divinci Registered Member

    Joined:
    Sep 23, 2009
    Messages:
    51
    Likes Received:
    11
    Occupation:
    Infrastructure Reverse Engineer
    Location:
    Liverpool UK
    What method are you using to scrape? WebBrowser automation or some sockets?

    I will be scraping soon from the UK using WebBrowser automation with rotating cookie profiles and will let you know how it goes.

    I would also like to run tests with the X-Forwarded-For header, but I haven't built that yet :)
     
  5. ehinoze

    ehinoze Power Member

    Joined:
    Feb 1, 2011
    Messages:
    674
    Likes Received:
    108
    Occupation:
    Internet marketing
    Location:
    London
    I recommend you use a VPN service like HideMyAss, etc., that will allow you to rotate IPs as frequently as you want.
     
  6. omida86

    omida86 Power Member

    Joined:
    Feb 15, 2011
    Messages:
    791
    Likes Received:
    181
    Occupation:
    SEO Consultant, Business Web Developer
    Location:
    Earth
    Where do you set it to rotate every request when scraping? I can only see that I can rotate after every post...
     
  7. mcurtis

    mcurtis Newbie

    Joined:
    May 30, 2011
    Messages:
    34
    Likes Received:
    2
    So will this work for scraping with Scrapebox? This sounds like a much cheaper deal than buying shared and private proxies, or am I looking at this wrong?
     
  8. kokoloko75

    kokoloko75 Elite Member

    Joined:
    Jan 1, 2011
    Messages:
    1,628
    Likes Received:
    1,935
    Occupation:
    Design director
    Location:
    Paris (France)
    Scrape only with public proxies.
    Use private proxies/VPN/... only for blog commenting.

    Beny
     
  9. alaltaierii

    alaltaierii Supreme Member

    Joined:
    Jun 11, 2010
    Messages:
    1,408
    Likes Received:
    349
    You can also scrape with private proxies if you have at least 20-30 and you use a low number of connections.
     
  10. SkyIsTheLimit

    SkyIsTheLimit BANNED

    Joined:
    Feb 11, 2012
    Messages:
    69
    Likes Received:
    12
    I use public proxies for scraping...
     
  11. iByte

    iByte Newbie

    Joined:
    Jan 24, 2012
    Messages:
    4
    Likes Received:
    1
    I seem to get captchas with public proxies... I thought these were temporary. Is it possible that a proxy could be permanently blacklisted?