Some advice on scraping and proxies (From a developer's perspective)

Discussion in 'Black Hat SEO Tools' started by cody41, Feb 23, 2012.

    If you're building a tool that runs query after query against Google, please build in the following features. I've just finished testing and fine-tuning, and can give some real-world impressions.

    1) Integrate with Proxy Multiply or Proxy Goblin. I can't say this enough. The integration is simple, really. The plugin I've just developed for WordPress pulls down a list of FRESH, Google-enabled proxies from one of my servers, where Proxy Multiply uploads a new list every half hour.

    My plugin queries Google to see whether backlinks are indexed. Needless to say, this EATS a ton of proxies. For example, for one project (website) in my plugin I track both keyword rankings in Google and whether the links uploaded under each tab (General, Bookmarking, GuestBook, Web 2.0, Directory Submissions, etc.) are indexed.

    And that's just for one website. The same plugin is tracking 30 sites right now. Yeah, I've got roughly half a million links to check for indexing. Got proxy?
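For the list-pulling part of point 1, here is a rough Python sketch of how a tool might cache a proxy list and only re-download it once the half-hour upload window has passed. Everything named here is an assumption for illustration: `ProxyListCache`, `parse_proxy_list`, the `fetch` callable, and the one-`host:port`-per-line format are hypothetical and are not how the plugin or Proxy Multiply actually work.

```python
import time

REFRESH_SECONDS = 30 * 60  # assumed to match the half-hourly upload


def parse_proxy_list(text):
    """Return the 'host:port' entries in text, skipping blanks, comments and junk."""
    proxies = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        host, sep, port = line.rpartition(":")
        if sep and host and port.isdigit():
            proxies.append(line)
    return proxies


class ProxyListCache:
    """Keep the last good proxy list and refetch it when it is too old."""

    def __init__(self, fetch, max_age=REFRESH_SECONDS, clock=time.monotonic):
        self._fetch = fetch      # hypothetical callable returning the raw list text
        self._max_age = max_age
        self._clock = clock
        self._proxies = []
        self._fetched_at = None

    def get(self):
        now = self._clock()
        stale = self._fetched_at is None or now - self._fetched_at >= self._max_age
        if stale:
            fresh = parse_proxy_list(self._fetch())
            if fresh:  # keep the old list if the download came back empty
                self._proxies = fresh
                self._fetched_at = now
        return list(self._proxies)
```

In a real tool `fetch` would be whatever downloads the list from your own server. Passing it in as a callable keeps the caching logic testable without touching the network.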

    2) Have your code rotate proxies after every 2 or 3 lookups. This will save proxies in the long run. It applies mainly to private proxies, so that you don't burn them out too quickly.
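The rotation in point 2 could look something like the following Python sketch. The class name `ProxyRotator` and the `lookups_per_proxy` parameter are made up for illustration. The default of 3 simply follows the "2 or 3 lookups" rule of thumb above.

```python
from itertools import cycle


class ProxyRotator:
    """Hand out proxies round-robin, switching after a fixed number of lookups."""

    def __init__(self, proxies, lookups_per_proxy=3):
        if not proxies:
            raise ValueError("proxy list is empty")
        if lookups_per_proxy < 1:
            raise ValueError("lookups_per_proxy must be at least 1")
        self._pool = cycle(proxies)
        self._limit = lookups_per_proxy
        self._current = next(self._pool)
        self._used = 0

    def next_proxy(self):
        """Return the proxy to use for the next lookup."""
        if self._used >= self._limit:
            # This proxy has done its share; move on to the next one.
            self._current = next(self._pool)
            self._used = 0
        self._used += 1
        return self._current
```

Call `next_proxy()` once before each query. With two proxies and `lookups_per_proxy=2`, five calls yield the first proxy twice, the second twice, then the first again.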

    3) If you code proxy scraping into your app (I don't, hence the reliance on the proxy tools mentioned above), make sure you use a good proxy judge and that the proxies are Google-enabled.
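For anyone unsure what a proxy judge buys you in point 3: a judge is a page that echoes back the request it received, so you can see what the proxy leaked. Below is a minimal Python sketch of the classification step only, assuming you have already fetched the judge page through the proxy and parsed the echoed headers into a dict. The function name, the header list, and the three labels are my own illustrative choices, not any particular judge's output format, and this does not test whether a proxy is Google-enabled (that needs an actual query through it).

```python
# Headers that commonly reveal a proxy is in the path (illustrative, not exhaustive).
PROXY_HEADERS = ("via", "x-forwarded-for", "forwarded", "proxy-connection", "x-real-ip")


def classify_proxy(echoed_headers, remote_addr, real_ip):
    """Classify a proxy from what a judge page saw.

    echoed_headers: headers the judge received, as a dict
    remote_addr:    the client address the judge saw
    real_ip:        your own public IP, which the proxy should hide
    """
    headers = {name.lower(): str(value) for name, value in echoed_headers.items()}

    # Simple substring check; good enough for a sketch, not for production.
    leaked = remote_addr == real_ip or any(real_ip in v for v in headers.values())
    if leaked:
        return "transparent"  # the target can see who you are
    if any(name in headers for name in PROXY_HEADERS):
        return "anonymous"    # IP hidden, but it is obviously a proxy
    return "elite"            # no sign of a proxy at all
```

A scraper would typically keep only the proxies that come back as "elite" (or at least "anonymous") before spending any queries on them.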