Discussion in 'Black Hat SEO Tools' started by s3ctum, Jul 20, 2014.
What's the best method to scrape perfect lists for GSA?
I recommend taking a look at Footprint Factory, they have a BST here.
No one can scrape perfect lists. Everyone has a different approach to scraping. Some go the footprint route, while others use a seed list to build their own lists.
I suggest that you first find which engines are most successful for you: tools > stats > list1 vs list2 > submitted vs successful.
Then import the data into Excel and work out each engine's success percentage. Start from the top and gradually move to engines with a lower success ratio. If you are short on time, take one engine per day.
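If you would rather not do the ranking in Excel, the same step can be sketched in Python. This is only a rough illustration: the column names `engine`, `submitted`, and `successful` and the sample rows are made up for the example and are not GSA SER's actual export format, so adjust them to whatever your stats export contains.

```python
import csv
import io


def rank_engines(csv_text):
    """Return (engine, success_pct) pairs sorted from highest to lowest ratio."""
    rows = csv.DictReader(io.StringIO(csv_text))
    ranked = []
    for row in rows:
        submitted = int(row["submitted"])
        successful = int(row["successful"])
        if submitted == 0:
            continue  # no submissions yet, so the ratio is undefined
        ranked.append((row["engine"], 100.0 * successful / submitted))
    # Highest success percentage first, so you can work from the top down.
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


# Hypothetical sample data; the engine names and counts are placeholders.
SAMPLE = """engine,submitted,successful
Engine A,200,50
Engine B,100,60
Engine C,0,0
"""

for engine, pct in rank_engines(SAMPLE):
    print(f"{engine}: {pct:.1f}%")
```

Working down the printed list gives the same "best engines first" order described above.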
GSA SER now has a built-in Footprint Studio, so there is no need to buy anything else: tools > footprint studio. Feed websites of the selected engine into it and it will give you a footprint list. Use that for scraping. You can use keywords too, but the list will be much smaller.
I scrape footprints by hand using software posted by s4ntos.
How do I get to these tools? I can't find them.
As mentioned above, Footprint Factory is very good (it's a pattern-matching tool at heart, but a pretty good one; the paid version is pretty cheap and worth the few bucks).
GSA has a great built-in site scraper that uses the exact footprints built into its "Machine.txt" file. Remember, if you scrape using a footprint GSA doesn't recognise, it might not recognise the site when it comes to creating a link, and give a false negative on site detection.
It looks for what it knows, so if you are scraping for a particular tool, scraping with its own inbuilt footprint list ensures it scrapes what it will later recognise and be able to post to.
The issue? It's slow(ish). When I use it to scrape, I'll leave it on for 25-30 days working through 25,000 keywords across maybe 40 search engines, and it will come back with maybe half a million confirmed sites a month. Hrefer will give me a list with that quantity inside 72 hours.
In GSA SER go to options >> advanced >> tools >> Footprint Studio.
Also, if you want to use the footprints included in GSA SER, which is a sensible place to start, just go to options >> advanced >> tools >> search online for urls. Then click add predefined footprints, choose the platform you want, and it will put the footprints in the footprints box.
Then you can just copy those from GSA into the keywords box in Scrapebox, hit the M merge button, choose a keyword list, and merge it. Then scrape away. Then load the list back into GSA by right-clicking on your project and importing target urls.
Also bear in mind when merging footprints with keywords that if you have a keyword list with 1,000 keywords and you grab every footprint in GSA and load in 1,000 footprints, that would result in 1 million combos. So you can see that if you load in 1,000 footprints and 100,000 keywords, you're looking at 100 million resulting combos, which is too high a target to start with.
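To make the arithmetic concrete, here is a small Python sketch of a footprint/keyword merge and the resulting combo count. It assumes the merge simply pairs every footprint with every keyword as "footprint keyword"; that is a simplification for illustration, not a description of how Scrapebox's merge works internally, and the function names and sample strings are hypothetical.

```python
from itertools import islice, product


def combo_count(num_footprints, num_keywords):
    """Every footprint is paired with every keyword, so the counts multiply."""
    return num_footprints * num_keywords


def merge(footprints, keywords, limit=None):
    """Lazily yield 'footprint keyword' queries, optionally capped at `limit`."""
    pairs = (f"{fp} {kw}" for fp, kw in product(footprints, keywords))
    return islice(pairs, limit)  # limit=None means no cap


print(combo_count(1000, 1000))     # 1000000
print(combo_count(1000, 100000))   # 100000000

# Placeholder footprints and keywords, capped at 3 queries.
sample_footprints = ['"powered by example"', '"leave a comment"']
sample_keywords = ["gardening", "fishing"]
for query in merge(sample_footprints, sample_keywords, limit=3):
    print(query)
```

Checking the count before merging, and capping or trimming either list, is one way to keep the first scrape at a manageable size.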