
Scrape 100k keywords... How?

Discussion in 'Other Languages' started by gheegh, Oct 8, 2011.

  1. gheegh

    gheegh Newbie

    Joined:
    Oct 12, 2009
    Messages:
    38
    Likes Received:
    2
    Hey all,

    I'm doing some research, trying to develop a strategy. Ideally, I'd like to get the top 100 results from Google, Bing, and maybe Ask (or one other search engine) for about 100k keywords.

    Any thoughts on how to do this? Even with proxies, I'm going to need thousands of them.

    Your thoughts would be great.

    G
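
    (For a sense of scale, here is a quick back-of-envelope sketch of the request volume this implies. The 10-results-per-page and three-engine figures are assumptions; only the 100k keywords and "top 100" depth come from the post above.)

    ```python
    # Back-of-envelope request count for the scrape described above.
    # Assumed: 10 organic results per SERP page, 3 search engines,
    # one HTTP request per results page.
    keywords = 100_000
    results_per_keyword = 100      # "top 100" per keyword
    results_per_page = 10
    engines = 3

    pages_per_keyword = results_per_keyword // results_per_page   # 10 pages deep
    total_requests = keywords * pages_per_keyword * engines

    print(f"{total_requests:,} requests")   # 3,000,000 requests
    ```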
     
  2. HelloInsomnia

    HelloInsomnia Jr. Executive VIP Jr. VIP Premium Member

    Joined:
    Mar 1, 2009
    Messages:
    1,816
    Likes Received:
    2,912
    You could do this with Scrapebox. Just load up the keywords into the harvester, set the results to 100 and scrape away.
     
  3. cyberzilla

    cyberzilla Elite Member Premium Member

    Joined:
    Nov 15, 2009
    Messages:
    2,204
    Likes Received:
    3,363
    Location:
    zeta reticuli
    Just get Scrapebox!
     
  4. gheegh

    gheegh Newbie

    Joined:
    Oct 12, 2009
    Messages:
    38
    Likes Received:
    2
    Scrapebox and what proxies? I understand that I can get them with Scrapebox. The problem is finding enough proxies to do it.
     
  5. Kickflip

    Kickflip BANNED

    Joined:
    Jan 29, 2010
    Messages:
    2,038
    Likes Received:
    2,465
    You don't need many proxies. If you can find 20-30 working proxies, which Scrapebox can find using the preloaded sources, then you are fine. Just restart your campaign every day and re-test your proxies.
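
    (A minimal sketch of that daily re-test, done outside Scrapebox with Python's requests library. The proxies.txt file name and the test URL are assumptions for illustration, not anything Scrapebox itself uses.)

    ```python
    import requests

    # Re-test a proxy list and keep only the proxies that can still reach Google.
    # proxies.txt is assumed to hold one ip:port per line.
    TEST_URL = "https://www.google.com/search?q=test"
    TIMEOUT = 10

    def working_proxies(path="proxies.txt"):
        good = []
        with open(path) as f:
            candidates = [line.strip() for line in f if line.strip()]
        for proxy in candidates:
            try:
                r = requests.get(
                    TEST_URL,
                    proxies={"http": f"http://{proxy}", "https": f"http://{proxy}"},
                    timeout=TIMEOUT,
                )
                if r.status_code == 200:
                    good.append(proxy)
            except requests.RequestException:
                pass  # dead or banned proxy, drop it
        return good

    if __name__ == "__main__":
        alive = working_proxies()
        print(f"{len(alive)} proxies still working")
    ```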
     
  6. gheegh

    gheegh Newbie

    Joined:
    Oct 12, 2009
    Messages:
    38
    Likes Received:
    2
    You think so?

    Right now, I'm blowing through about 100 proxies for 10k keywords. To get to what I am trying to do, I need about 10 times the volume. So, I'm trying to figure out if there is a good solution.

    I am wondering if Scrapebox does something besides using proxies to keep Google from shutting me down?
     
  7. Kickflip

    Kickflip BANNED

    Joined:
    Jan 29, 2010
    Messages:
    2,038
    Likes Received:
    2,465
    Blowing through 100 proxies? I don't know what that means. I can scrape 1 million keywords from 10 proxies in a day. There must be something else you are doing wrong.
     
  8. Kickflip

    Kickflip BANNED

    Joined:
    Jan 29, 2010
    Messages:
    2,038
    Likes Received:
    2,465
    I just completed this scrape since my last post.

    [screenshot of the harvester results]

    Of those, 18,000 were unique keywords.
     
  9. eternalfrost

    eternalfrost Regular Member

    Joined:
    Apr 9, 2011
    Messages:
    213
    Likes Received:
    54
    Free proxies and Scrapebox will do this easily.
    For that amount, you will need to repeat the process a few times, but it shouldn't be a big problem.

    Just remember that many thousands of other people are using those free proxies for who knows what, so of course they are going to get 'burned out'. That's mostly not from your actions, though...
     
  10. thinkdevoid

    thinkdevoid Regular Member

    Joined:
    Aug 30, 2011
    Messages:
    446
    Likes Received:
    131
    Location:
    192.168.1.1
    Another tip (if you wish) is to find your own proxy sources rather than using the public ones included with Scrapebox. Those tend to be saturated with other Scrapebox users hitting the same lists.

    It may just be me, but as soon as I started adding my own sources, things started to look up :cool:
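
    (A rough sketch of that idea outside Scrapebox: pull proxy lists from source URLs you maintain yourself and merge them. The URLs below are placeholders, not real sources.)

    ```python
    import re
    import requests

    # Hypothetical private source URLs -- replace with pages you have found yourself.
    SOURCES = [
        "https://example.com/my-proxy-list-1.txt",
        "https://example.com/my-proxy-list-2.txt",
    ]

    PROXY_RE = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}:\d{2,5}\b")

    def harvest_proxies():
        found = set()
        for url in SOURCES:
            try:
                text = requests.get(url, timeout=15).text
            except requests.RequestException:
                continue  # skip sources that are down
            found.update(PROXY_RE.findall(text))
        return sorted(found)

    if __name__ == "__main__":
        proxies = harvest_proxies()
        print(f"collected {len(proxies)} unique proxies")
    ```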
     
  11. dcuthbert

    dcuthbert Regular Member

    Joined:
    Jun 15, 2011
    Messages:
    411
    Likes Received:
    249
    there was a 350k keyword list here somewhere, you could just download that and save yourself the trouble ;)
     
  12. eternalfrost

    eternalfrost Regular Member

    Joined:
    Apr 9, 2011
    Messages:
    213
    Likes Received:
    54
    Yes, I assumed in my earlier post that you were already doing this...

    The pre-packaged proxy sources that come with SB are burnt to hell. Find some of your own.
     
  13. gheegh

    gheegh Newbie

    Joined:
    Oct 12, 2009
    Messages:
    38
    Likes Received:
    2
    Guys,

    I'm not looking to build keyword lists... I already have those. I'm looking to scrape Google results to understand why/what is ranking, and to track those rankings over time (looking at the individual pages that are ranking to develop trends).

    I'm trying to go 100 deep, so I iterate over a "keyword", pulling positions 1 to 100.

    Maybe that's the difference?

    I'm going to spin up my Scrapebox, load a few thousand in, and try that... and see if it blows out my proxies as well.

    Any other thoughts would be welcome.
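
    (Since the goal is tracking positions over time rather than just collecting URLs, one sketch, assuming you can already export a ranked URL list per keyword from the harvester, is to timestamp each run and store it so trends can be queried later. The database and table names below are made up for illustration.)

    ```python
    import sqlite3
    from datetime import date

    # Store one row per (day, keyword, position, url) so ranking movements
    # can be charted later. rankings.db and the table name are arbitrary.
    def record_rankings(keyword, ranked_urls, db_path="rankings.db"):
        con = sqlite3.connect(db_path)
        con.execute(
            """CREATE TABLE IF NOT EXISTS serp
               (day TEXT, keyword TEXT, position INTEGER, url TEXT)"""
        )
        today = date.today().isoformat()
        con.executemany(
            "INSERT INTO serp VALUES (?, ?, ?, ?)",
            [(today, keyword, pos, url) for pos, url in enumerate(ranked_urls, start=1)],
        )
        con.commit()
        con.close()

    # Example: positions 1..100 for one keyword, as exported from the harvester.
    record_rankings("blue widgets", ["http://example.com/page%d" % i for i in range(1, 101)])
    ```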
     
  14. xpressioniz

    xpressioniz Junior Member

    Joined:
    Jun 4, 2008
    Messages:
    121
    Likes Received:
    15
    I think he is trying to harvest the search results for his keywords, not use the keyword scraper feature in Scrapebox.

    It's painful and you have to prepare lots of proxies. I have no idea what the solution for this is, except getting more proxies.
     
  15. HelloInsomnia

    HelloInsomnia Jr. Executive VIP Jr. VIP Premium Member

    Joined:
    Mar 1, 2009
    Messages:
    1,816
    Likes Received:
    2,912
    Sounds like you need something like SE Scout. That will help you track where the sites are ranking for the keywords. But you have to enter all the data in manually at first - there is no way to just scrape them and export them.
     
  16. johndea

    johndea Regular Member

    Joined:
    Jun 23, 2011
    Messages:
    308
    Likes Received:
    35
    How many keywords do you want to go 100 deep for?

    Why is it a problem to go 10 pages deep (100 results)?
     
  17. dunker

    dunker Newbie

    Joined:
    Jun 16, 2010
    Messages:
    15
    Likes Received:
    1
    When you're going "deep", it's usually best to first compile a list of all the links you're going to crawl, make sure they are unique (filter out duplicate links), and then proceed with crawling.

    This way you don't end up visiting the same subpages more than once.
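
    (A small sketch of that dedup step: normalize each URL before crawling so trivially different spellings of the same page collapse to one entry. The normalization rules here are a minimal assumption, not a standard.)

    ```python
    from urllib.parse import urlsplit, urlunsplit

    def normalize(url):
        """Lowercase scheme and host, drop fragments, and strip trailing
        slashes so equivalent spellings of the same page collapse together."""
        parts = urlsplit(url.strip())
        path = parts.path.rstrip("/") or "/"
        return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))

    def unique_urls(urls):
        seen = set()
        out = []
        for url in urls:
            key = normalize(url)
            if key not in seen:
                seen.add(key)
                out.append(url)
        return out

    # Example: the duplicates collapse to a single crawl target.
    print(unique_urls([
        "http://Example.com/page/",
        "http://example.com/page",
        "http://example.com/page#section",
    ]))
    ```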