How to build your own site lists for GSA SER

davids355
Just a quick guide showing you how to scrape your own lists to use with GSA SER. Of course, GSA SER has its own built-in tool for scraping URLs to post to, but it's not that great, and it ties up resources that you really want to reserve for actually posting to URLs rather than finding them.

Things you will need:


  • GSA SER (Of course).
  • Scrapebox
  • Proxies - 30 shared proxies will do fine

STEP 1: Getting the list of footprints together

First of all you need a list of footprints - preferably the ones that GSA actually uses. You can get these by opening GSA, then going to Options > Tools and clicking on "Search online for URLs".
Once in there, go through the engines one by one and select "Add all footprints for...". This will gradually compile a list of all the footprints you need.
Once you have them, tick the "Save to file" box or just copy and paste them into a text file.


Once you have done this for all engines you should end up with around 2300 footprints.
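If you want to tidy that footprint file up outside of Scrapebox, a few lines of Python will de-duplicate it - a minimal sketch, assuming your export is a plain text file with one footprint per line (the file names here are just placeholders):

```python
# Minimal sketch: de-duplicate the footprint file exported from GSA SER.
# "footprints.txt" / "footprints_clean.txt" are placeholder file names.

seen = set()
cleaned = []

with open("footprints.txt", encoding="utf-8", errors="ignore") as f:
    for line in f:
        fp = line.strip()
        if fp and fp.lower() not in seen:   # case-insensitive de-dupe
            seen.add(fp.lower())
            cleaned.append(fp)

with open("footprints_clean.txt", "w", encoding="utf-8") as f:
    f.write("\n".join(cleaned))

print(f"{len(cleaned)} unique footprints kept")
```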

STEP 2: Getting the list of keywords together

The next thing you need is a list of keywords. I normally do this per project so that I get a targeted set of keywords, and end up with a relatively unique, if not perfectly targeted, list of URLs at the end of it all.

So open up Scrapebox, click on Scrape and then Keyword Scraper.


Put a couple of seed keywords in here for your niche and then click Start. Once it completes, click "Transfer results to left side" and then click Start again. Leave it running for a while until you have a nice list of keywords.


Once you have this list export it to a text file.
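Incidentally, what the keyword scraper is doing under the hood is hitting the search engines' autocomplete services. If you're curious, here's a rough Python sketch of the same idea using Google's suggest endpoint - note this endpoint is unofficial and undocumented, so treat the URL and response format as assumptions that may change or get rate-limited:

```python
# Rough sketch of keyword expansion via Google's (unofficial) suggest
# endpoint. Assumes the response is JSON shaped like ["query", [suggestions]].

import json
import time
import urllib.parse
import urllib.request

def suggest(seed):
    url = ("http://suggestqueries.google.com/complete/search"
           "?client=firefox&q=" + urllib.parse.quote(seed))
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)[1]

seeds = ["dog training", "puppy training"]  # placeholder niche seeds
keywords = set(seeds)

for _ in range(2):  # two passes, like "Transfer results to left side" + Start
    for kw in list(keywords):
        try:
            keywords.update(suggest(kw))
        except Exception:
            pass           # skip failed / rate-limited requests
        time.sleep(1)      # be gentle - Scrapebox uses your proxies for this

with open("keywords.txt", "w", encoding="utf-8") as f:
    f.write("\n".join(sorted(keywords)))
```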


STEP 3: Merging your list of keywords with your footprints to create a list of Google search queries

Go back to the main Scrapebox screen, clear anything currently in the keyword section and import your list of keywords that you previously scraped.

Then click on "M" and import your list of footprints as well - this will merge the two together to create a nice list of search queries.

Personally, when I do this I use around 2,500 keywords along with the 2,300 footprints, which gives you about 5.75 million queries (2,500 x 2,300). If you have more keywords than that, you can cut the list down using DupRemove - instructions included in the next step.

Once you have the list of keyword+footprint combinations save them to a new text file.
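The merge itself is just every footprint paired with every keyword. Here's a minimal Python sketch of what the "M" button produces, using placeholder file names from the earlier steps:

```python
# Minimal sketch of the footprint x keyword merge (Scrapebox's "M" button).
# 2,500 keywords x 2,300 footprints = 5,750,000 search queries, which is
# why the list gets randomized and split in the next step.

with open("keywords.txt", encoding="utf-8") as f:
    keywords = [k.strip() for k in f if k.strip()]

with open("footprints_clean.txt", encoding="utf-8") as f:
    footprints = [fp.strip() for fp in f if fp.strip()]

with open("queries.txt", "w", encoding="utf-8") as out:
    for fp in footprints:
        for kw in keywords:
            out.write(f"{fp} {kw}\n")

print(f"{len(footprints) * len(keywords)} queries written")
```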


STEP 4: Randomizing and splitting your list

This isn't imperative, but I normally randomize my list so that I get an even spread of URLs across all of the different platforms. I also normally split my list, because I don't need to scrape using 5 million queries in one go.
If you go to Addons in Scrapebox and install DupRemove, you can achieve this easily.
Once installed, open DupRemove, randomize the list first, then open that randomized list and split it. I personally split my lists into 10k files and then scrape using those one at a time.
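If you'd rather script this step, here's a minimal Python sketch of the same randomize-and-split, assuming the merged queries file from step 3:

```python
# Minimal sketch of the randomize + split that DupRemove does here:
# shuffle the merged queries, then write them out in 10,000-line files.

import random

with open("queries.txt", encoding="utf-8") as f:
    queries = [q.strip() for q in f if q.strip()]

random.shuffle(queries)  # even mix of platforms in every chunk

CHUNK = 10_000
for i in range(0, len(queries), CHUNK):
    name = f"queries_part_{i // CHUNK + 1:03d}.txt"
    with open(name, "w", encoding="utf-8") as out:
        out.write("\n".join(queries[i:i + CHUNK]))
```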



STEP 5: Begin harvesting

Now that you have a set of 10,000 randomized keyword+footprint combinations, you can import them back into Scrapebox and begin scraping - just make sure you clear anything already in the keyword section, import the 10k list, and then click on "Start Harvesting".

With 30 proxies and a 4-core VPS I normally leave this running for 12-24 hours and then stop it. This should give you 1-2 million URLs.

Once that is complete, close the harvester and the URLs will be transferred into the main window, where you can remove duplicates and then save the list.
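For reference, the de-duplication is the same trick as before, just on URLs this time. A minimal sketch, assuming the harvested list was saved to a placeholder "harvested.txt" - note that for GSA you normally want unique URLs rather than unique domains, but both are shown:

```python
# Minimal sketch of de-duplicating the harvested list.
# Scrapebox can dedupe by exact URL or by domain; both shown below.

from urllib.parse import urlparse

with open("harvested.txt", encoding="utf-8", errors="ignore") as f:
    urls = [u.strip() for u in f if u.strip()]

unique_urls = list(dict.fromkeys(urls))  # exact-URL dedupe, order kept

seen = set()
unique_domains = []
for u in unique_urls:
    domain = urlparse(u).netloc.lower()
    if domain and domain not in seen:
        seen.add(domain)
        unique_domains.append(u)

with open("harvested_dedup.txt", "w", encoding="utf-8") as f:
    f.write("\n".join(unique_urls))  # keep unique URLs for GSA

print(f"{len(urls)} scraped, {len(unique_urls)} unique URLs, "
      f"{len(unique_domains)} unique domains")
```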


STEP 6: Sorting the list into GSA

Once you have this list, it needs to be sorted into GSA. This can be done by clicking on Options > Tools > Import URLs. In my experience importing from a file often fails, so I open the text file first, copy all the URLs, and then follow the procedure above but select "Import from clipboard". Once you have done that, GSA will start working through the list and importing the URLs into its "Identified" list.

STEP 7: Start posting

Importing 2 million URLs will probably give you somewhere between 300k and 500k URLs that GSA can actually post to, and once you have that you are good to go.


LPM after using this method: (screenshot)
 
Amazing guide! Should be a sticky and mandatory reading for all newbies getting into SEO automation.
 
That looks awesome davids, I was searching for something similar. Good job - no fluff, straight to the point. Thank you
 
You should be promoted asap David, great share :)
 
Thanks guys :-) Glad you like it - it's probably quite a simple process to most, but for anyone starting out with GSA hopefully it's useful :-)
 
Don't want to sound like I'm repeating other posters, but this is a great post. I know when I first used GSA this would have come in handy. GSA looks intimidating as a newb, so this will certainly help those just starting out.
 
Nice and easy, good post here for the new guys.
 
Thanks for the comments guys, glad it's appreciated.
 
Short but effective guide - lots of BHW members are going to benefit from this for sure. Good job OP!
 