Best way to scrape millions of keywords per month from Google

Discussion in 'Black Hat SEO' started by meisteralex, Jun 21, 2017.

  1. meisteralex

    meisteralex Newbie

    Joined:
    Feb 14, 2016
    Messages:
    6
    Likes Received:
    0
    Hey,

    I want to scrape roughly 10-15 million keywords from Google every month.
    My problem is the IP ban that occurs every 100-200 keywords.

    I've tested some methods to avoid the IP ban and want to discuss with you what the best method for scraping Google is.

    Methods I've tested are:
    - VPN crawler with HideMyAss that reconnects on every ban.
    - DSL crawling with a crawler that reconnects on every ban.

    So what are your preferred methods to crawl tons of keywords?

    I've heard about proxy crawling. What about buying 50 private proxies and setting the delay for every proxy to 5 seconds? Has anybody done this?
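
A quick back-of-the-envelope check of the 50-proxy idea, as a rough sketch only. It computes the theoretical ceiling, assuming every proxy runs around the clock for a 30-day month, each request returns instantly, and nothing gets banned. The function name is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def max_queries_per_month(proxies: int, delay_seconds: float, days: int = 30) -> int:
    """Upper bound on queries if every proxy fires once per delay, nonstop."""
    per_proxy_per_day = SECONDS_PER_DAY / delay_seconds
    return int(proxies * per_proxy_per_day * days)


# 50 private proxies, one request every 5 seconds per proxy
print(max_queries_per_month(50, 5))  # 25920000
```

On paper that ceiling is above the 10-15 million target, so the open question is not the arithmetic but whether a single IP survives one request every 5 seconds.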

    Does anybody know what methods are used by the big players to fill their databases?

    Kind Regards
    ma
     
  2. BlackSeoGuy

    BlackSeoGuy BANNED

    Joined:
    Feb 4, 2015
    Messages:
    21
    Likes Received:
    1
    Hi,

    The delay thing does not work. I would suggest using a backconnect proxy.
     
  3. meisteralex

    meisteralex Newbie

    Joined:
    Feb 14, 2016
    Messages:
    6
    Likes Received:
    0
    @BlackSeoGuy Why does the delay thing not work?
     
  4. Javardo69

    Javardo69 Junior Member

    Joined:
    Jul 19, 2014
    Messages:
    106
    Likes Received:
    6
    I believe that crawling 20 different keywords within an hour is already enough to trigger Google's block.

    Some solutions: either pay for a proxy rotation service, or write a script that scrapes proxy IPs from different websites, tests each IP with a specific keyword to check that it delivers a clean response as expected, and adds the working IPs to some DB (it can be just a .txt or .csv file lol).

    Your keyword scraper would then load the fresh IPs from the DB, pick one at random and scrape Google.
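
The "DB" part of that workflow could be sketched roughly like this. It is only an illustration under assumptions: the file is a one-column .csv, all function names are made up, and `is_clean` is a placeholder callback you would have to write yourself (the actual test request is deliberately left out).

```python
import csv
import random
from typing import Callable, Iterable, List


def filter_working(candidates: Iterable[str], is_clean: Callable[[str], bool]) -> List[str]:
    """Keep only the proxies for which the supplied check reports a clean response."""
    return [proxy for proxy in candidates if is_clean(proxy)]


def save_pool(path: str, proxies: Iterable[str]) -> None:
    """Write one proxy per row to a .csv file."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for proxy in proxies:
            writer.writerow([proxy])


def load_pool(path: str) -> List[str]:
    """Read the proxies back, skipping empty rows."""
    with open(path, newline="") as f:
        return [row[0] for row in csv.reader(f) if row]


def pick_proxy(pool: List[str]) -> str:
    """Pick a random proxy for the next request."""
    return random.choice(pool)
```

Usage would be something like `save_pool("proxies.csv", filter_working(scraped_ips, my_check))` in the tester script, then `pick_proxy(load_pool("proxies.csv"))` in the scraper, where `scraped_ips` and `my_check` are whatever your own code provides.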
     
  5. Onmef

    Onmef Newbie

    Joined:
    Jun 19, 2017
    Messages:
    33
    Likes Received:
    0
    Gender:
    Male
    I don't scrape anymore, I just dump keywords.
    Alternatively, you can buy cheap Windows VPSes and install Keyword Snatcher on them.
     
  6. 710fla

    710fla Jr. VIP

    Joined:
    Aug 25, 2015
    Messages:
    789
    Likes Received:
    227
    Use backconnect proxies like StormProxies.com; they rotate proxies every couple of minutes.
     
  7. meisteralex

    meisteralex Newbie

    Joined:
    Feb 14, 2016
    Messages:
    6
    Likes Received:
    0
    So, I've booked a package from StormProxies.com, but most of my requests (approx. 75%) are blocked by the Google server (503, IP ban). What can I do?
     
  8. cplay

    cplay Regular Member

    Joined:
    Sep 30, 2016
    Messages:
    306
    Likes Received:
    96
    Gender:
    Male
    I am also very interested in this field.

    For example, how are giant companies like Ahrefs, SEMrush, etc. scraping Google successfully?

    Are they using rotating proxies, or what?
     
  9. arsen99

    arsen99 Jr. VIP

    Joined:
    Nov 28, 2010
    Messages:
    311
    Likes Received:
    47
    Maybe you could try our SCRAPE proxies which are great with G scraping.

    As for delays not working: that's not quite true, actually. With the wrong delays they will not work, but with good ones you can scrape for a long time without any errors.
    With safe settings and our SCRAPE package you can reach 10 mln queries with ease. And you leave bans behind ;)

    All the best!
     
  10. meisteralex

    meisteralex Newbie

    Joined:
    Feb 14, 2016
    Messages:
    6
    Likes Received:
    0
    @arsen99 Can you tell me more about that: "With safe settings and our SCRAPE package you can reach 10 mln queries with ease. And you leave bans behind"


    We're searching for a reliable solution for our business.
     
  11. redarrow

    redarrow Elite Member

    Joined:
    Apr 1, 2013
    Messages:
    5,970
    Likes Received:
    1,437
    Still waiting for a discount code. Sorry to you all for butting in...
     
  12. arsen99

    arsen99 Jr. VIP

    Joined:
    Nov 28, 2010
    Messages:
    311
    Likes Received:
    47
    @meisteralex sure thing :)
    The main thing behind successful scraping is setting proper delays between connections per proxy. The right delay depends on many factors: how long you are going to scrape (continuously or in short periods), what kind of keywords you are planning to use (with or without advanced operators) and, of course, the number of proxies.
    For safe scraping you can use our package of 700 scrape proxies and set delays of 2-3 min., so you can easily make over 10 mln queries, and you can do this in continuous mode.
    For shorter scrape runs, delays can be set at 30-45 sec. Of course, the more often you connect to G through the proxies, the higher the risk of catching bans.
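
For anyone who wants to sanity-check those numbers, here is a rough sketch of the arithmetic. It assumes a 30-day month of continuous scraping with every proxy in use and no failed or banned requests; the function names are made up for illustration.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumes a 30-day month, running nonstop


def monthly_queries(proxies: int, delay_seconds: float) -> int:
    """Theoretical queries per month with one request per proxy per delay."""
    return int(proxies * SECONDS_PER_MONTH / delay_seconds)


def max_delay_for_target(proxies: int, target_queries: int) -> float:
    """Longest per-proxy delay (seconds) that still reaches the monthly target."""
    return proxies * SECONDS_PER_MONTH / target_queries


# 700 proxies at the 2 min and 3 min ends of the suggested range
for delay in (120, 180):
    print(delay, monthly_queries(700, delay))
# 120 15120000
# 180 10080000

# Longest delay that still gives 10 mln queries with 700 proxies
print(round(max_delay_for_target(700, 10_000_000), 1))  # 181.4
```

So under these idealized assumptions the 2-3 min range with 700 proxies lands at roughly 10-15 mln queries per month, with about 3 minutes being the slowest setting that still clears 10 mln.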

    All the best!