How to scrape perfect lists for GSA?

Discussion in 'Black Hat SEO Tools' started by s3ctum, Jul 20, 2014.

  1. s3ctum

    s3ctum Registered Member

    Joined:
    Jun 27, 2014
    Messages:
    86
    Likes Received:
    2
    What's the best method to scrape perfect lists for GSA?
     
  2. lord1027

    lord1027 Elite Member

    Joined:
    Sep 20, 2013
    Messages:
    3,177
    Likes Received:
    2,239
    I recommend taking a look at Footprint Factory, they have a BST here.
     
  3. divok

    divok Senior Member

    Joined:
    Jul 21, 2010
    Messages:
    1,067
    Likes Received:
    648
    Location:
    .IN
    No one can scrape perfect lists. Everyone has a different approach to scraping: some go the footprint route, while others use a seed list to build their own lists.

    I suggest that you first find out which engines are most successful for you: tools > stats > list1 vs list2 > submitted vs successful.
    Then import the data into Excel and work out each engine's success percentage. Start with the highest and gradually move down to engines with a lower success ratio. If you are short on time, take one engine per day.
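
    If you would rather not do the ranking in Excel, here is a rough Python sketch of the same idea. The CSV layout (columns `engine`, `submitted`, `successful`), the engine names, and the numbers are all made up for illustration; this is not GSA SER's actual export format, so adjust the column names to whatever your stats export really contains.

```python
import csv
import io


def success_ratios(csv_text):
    """Return (engine, ratio) pairs sorted from highest to lowest success ratio."""
    ratios = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        submitted = int(row["submitted"])
        successful = int(row["successful"])
        if submitted == 0:
            continue  # nothing submitted, so there is no ratio to compute
        ratios.append((row["engine"], successful / submitted))
    # Highest success ratio first, so you can work down the list
    return sorted(ratios, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample data, purely illustrative
sample = """engine,submitted,successful
EngineA,1000,50
EngineB,400,120
EngineC,0,0
"""

for engine, ratio in success_ratios(sample):
    print(f"{engine}: {ratio:.1%}")
```

    Engines with zero submissions are skipped rather than counted as 0%, since there is no data to judge them on.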

    GSA SER now has a built-in Footprint Studio, so there is no need to buy anything else: tools > footprint studio. Feed websites of the selected engine into it and it will give you a footprint list. Use that for scraping. You can use keywords too, but the list will be much smaller.

    I scrape footprints by hand using software posted by s4ntos.
     
    • Thanks Thanks x 1
  4. MikeyMikey13

    MikeyMikey13 Supreme Member

    Joined:
    May 25, 2014
    Messages:
    1,411
    Likes Received:
    389
    How do I get to these tools? I can't find them.
     
  5. Scritty

    Scritty Elite Member Premium Member

    Joined:
    May 1, 2010
    Messages:
    2,862
    Likes Received:
    4,556
    Gender:
    Male
    Occupation:
    Affiliate Marketer
    Location:
    UK
    As mentioned above, Footprint Factory is very good (it's a pattern-matching tool at heart, but a pretty good one; the paid version is cheap and worth the few bucks).
    GSA has a great built-in site scraper that uses the exact footprints built into its "Machine.txt" file. Remember, if you scrape using a footprint GSA doesn't recognise, it might not recognise the site when it comes to creating a link and give a false negative on site detection.
    It looks for what it knows, so if you are scraping for a particular tool, scraping with its own inbuilt footprint list ensures it scrapes what it will later recognise and be able to post to.

    The issue? It's slow(ish). When I use it to scrape, I'll leave it on for 25-30 days working through 25,000 keywords across maybe 40 search engines, and it will come back with maybe half a million confirmed sites a month. Hrefer will give me a list of that size inside 72 hours.

    Scritty
     
  6. loopline

    loopline Jr. VIP Jr. VIP

    Joined:
    Jan 25, 2009
    Messages:
    3,853
    Likes Received:
    2,046
    Gender:
    Male
    Home Page:
    In GSA SER go to options >> advanced >> tools >> Footprint Studio.

    Also, if you want to use the footprints included in GSA SER, which is a sensible place to start, just go to options >> advanced >> tools >> search online for urls. Then click add predefined footprints, choose the platform you want, and it will put the footprints in the footprints box.

    Then you can just copy those from GSA into the keywords box in Scrapebox, hit the M (merge) button, choose a keyword list, and merge it. Then scrape away. Then load the list back into GSA by right-clicking on your project and importing target urls.

    Also bear in mind when merging footprints with keywords that the combinations multiply: a keyword list of 1,000 keywords merged with 1,000 footprints from GSA results in 1 million combos. Load in 1,000 footprints and 100,000 keywords and you're looking at 100 million combos, which is too many to start with.
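
    To make the multiplication concrete, here is a small Python sketch. It assumes a merge simply joins every footprint with every keyword, separated by a space; that is a simplification for illustration, not a description of how Scrapebox implements its merge. The footprints and keywords are hypothetical placeholders.

```python
from itertools import islice, product


def combo_count(n_footprints, n_keywords):
    """Number of queries produced when every footprint meets every keyword."""
    return n_footprints * n_keywords


def merge_queries(footprints, keywords):
    """Lazily yield one query per footprint/keyword pair."""
    for footprint, keyword in product(footprints, keywords):
        yield f"{footprint} {keyword}"


# The two cases from the post above
print(combo_count(1000, 1000))      # 1,000 footprints x 1,000 keywords
print(combo_count(1000, 100_000))   # 1,000 footprints x 100,000 keywords

# Hypothetical footprints and keywords, purely illustrative
footprints = ['"powered by exampleboard"', 'inurl:"example/register"']
keywords = ["gardening", "travel", "fitness"]

# Peek at the first few merged queries without building the whole list
for query in islice(merge_queries(footprints, keywords), 3):
    print(query)
```

    Checking the count before you merge is cheap, and it tells you whether to trim the footprint list or the keyword list first.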