
Scrapebox - merged huge lists of keywords and footprints

Discussion in 'Black Hat SEO' started by cozziola, Jun 15, 2014.

  1. cozziola

    cozziola Regular Member

    Joined:
    May 20, 2014
    Messages:
    288
    Likes Received:
    5
    For anyone who would like a great list of footprints for SB, see here: http://www.blackhatworld.com/blackh...35925-get-all-footprint-blogs-forums-etc.html

    I have watched a handful of videos on SB, so I am by no means an expert.

    What I have done is scrape a list of keywords which runs into the many thousands. I have then merged it with the above footprint list.

    I want to start harvesting now, but SB is saying it has almost 4 MILLION keywords.

    Am I right in thinking this list is too big? How do I go about tackling this? Maybe dump some footprints?

    I will be using the URLs with GSA.
     
  2. cozziola

    cozziola Regular Member

    Joined:
    May 20, 2014
    Messages:
    288
    Likes Received:
    5
    Sheeeeet! I think I'm supposed to only use 1 footprint at a time? This is gonna take me around 15 years to get through the list!

    Am I missing something here, guys?
     
  3. cozziola

    cozziola Regular Member

    Joined:
    May 20, 2014
    Messages:
    288
    Likes Received:
    5
    I am now trying to build a list with SB using GSA article footprints. I have 1.7 million keywords being harvested, at 1,000 URLs per keyword I think, so potentially 1.7 billion URLs in total.

    Can someone tell me if this is the norm, or if my list is way too big?

    The last thing I want to be doing is harvesting for days on end if it isn't needed :)
     
  4. trish1

    trish1 BANNED

    Joined:
    May 29, 2014
    Messages:
    322
    Likes Received:
    187
    Thanks for this thread, just downloaded the file.
     
  5. cozziola

    cozziola Regular Member

    Joined:
    May 20, 2014
    Messages:
    288
    Likes Received:
    5
    no problem mate :)
     
  6. mindmaster

    mindmaster Jr. VIP Premium Member

    Joined:
    Sep 16, 2010
    Messages:
    2,501
    Likes Received:
    1,135
    Location:
    at my new office
    You can scrape for multiple footprints at one time.

    If your PC/VPS is not that powerful, just break down your kw/footprint list (see the sketch below).
    Use a bunch of proxies and you will finish it in no time.
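
    A minimal sketch of what breaking the list down could look like, assuming the merged search strings sit in one plain text file, one per line (the file names and chunk size here are placeholders, not anything ScrapeBox itself produces):

    CHUNK_SIZE = 100_000  # search strings per file; tune to your machine

    with open("merged_keywords.txt", encoding="utf-8") as src:
        chunk, index = [], 1
        for line in src:
            chunk.append(line)
            if len(chunk) == CHUNK_SIZE:
                with open(f"chunk_{index:03d}.txt", "w", encoding="utf-8") as out:
                    out.writelines(chunk)
                chunk, index = [], index + 1
        if chunk:  # write the remainder
            with open(f"chunk_{index:03d}.txt", "w", encoding="utf-8") as out:
                out.writelines(chunk)

    Each chunk_*.txt can then be loaded into the harvester as its own session.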

    If it turns out to be too much time/money spent, you can always buy a GSA list from the BST section. ;)
     
  7. cozziola

    cozziola Regular Member

    Joined:
    May 20, 2014
    Messages:
    288
    Likes Received:
    5
    VPS is pretty powerful with more than enough memory.

    I have 50 paid for shared proxies.

    Google, Yahoo, Bing all have 3 connections each.

    Google avg URLs/s = 0
    Yahoo avg URLs/s = 10
    Bing avg URLs/s = 10

    Does this sound fast enough to you, man? Can I speed this up?
     
  8. Sweetfunny

    Sweetfunny Jr. VIP Premium Member

    Joined:
    Jul 13, 2008
    Messages:
    1,747
    Likes Received:
    5,038
    Location:
    ScrapeBox v2.0
    If you use the merge feature and merge a footprint file with a keyword file, it appends every keyword to every footprint. So if you have 3 footprints and 3 keywords, it will multiply them together and create 9 search strings in total, like:

    Footprint 1 - Keyword 1
    Footprint 1 - Keyword 2
    Footprint 1 - Keyword 3
    Footprint 2 - Keyword 1
    Footprint 2 - Keyword 2
    Footprint 2 - Keyword 3
    Footprint 3 - Keyword 1
    Footprint 3 - Keyword 2
    Footprint 3 - Keyword 3
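
    Here is a toy sketch of that multiplication in Python (the footprints and keywords are made up for illustration):

    from itertools import product

    # Pair every footprint with every keyword, just like the merge feature.
    footprints = ['"powered by vbulletin"', 'inurl:blog', 'intitle:guestbook']
    keywords = ["keyword 1", "keyword 2", "keyword 3"]

    search_strings = [f"{fp} {kw}" for fp, kw in product(footprints, keywords)]
    print(len(search_strings))  # 3 footprints x 3 keywords = 9 strings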

    If you merge 10,000 keywords with 1,000 footprints, you will create 10 million search strings. To put it in perspective, if you then choose to harvest just 100 URLs per search string, you're telling it to harvest potentially 1,000,000,000 URLs from every search engine you select.

    So I don't think you are missing anything, you're just not realizing how huge the data you are creating is.

    In the harvester_sessions folder, ScrapeBox will save the harvested URLs in 1 million URL batches in real time. So you could grab the first batch of 1 million URLs while ScrapeBox is still harvesting, then run them through GSA. ScrapeBox will harvest 1 million URLs quicker than GSA will post to them, so when GSA has finished the first 1 million, there will be more 1 million batch files saved in ScrapeBox ready to go. So you could do it this way.
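
    A rough sketch of that hand-off loop, assuming the batches land as .txt files in harvester_sessions (the folder path is a guess for a typical install, and the hand-off to GSA is just a print here):

    import time
    from pathlib import Path

    # Poll the harvester_sessions folder and flag each new batch file
    # so it can be fed to GSA while ScrapeBox keeps harvesting.
    SESSIONS = Path(r"C:\ScrapeBox\harvester_sessions")
    seen = set()

    while True:
        batches = sorted(SESSIONS.glob("*.txt"))
        # Skip the newest file, since ScrapeBox may still be writing to it.
        for batch in batches[:-1]:
            if batch not in seen:
                seen.add(batch)
                print(f"New batch ready for GSA: {batch}")
        time.sleep(60)  # check once a minute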
     
    Last edited: Jun 16, 2014
  9. cozziola

    cozziola Regular Member

    Joined:
    May 20, 2014
    Messages:
    288
    Likes Received:
    5
    That's a great tip man, thanks! But won't I need to use SB to check over the first million URLs to see if they are good or not?

    How do my connections and speed sound to you?

     
    Last edited: Jun 16, 2014
  10. strovolo

    strovolo Registered Member

    Joined:
    Aug 5, 2009
    Messages:
    90
    Likes Received:
    12
    Location:
    Sin City
    ScrapeBox is a GREAT tool, but when it comes to footprint scraping, I have only one word to say: Gscraper.
     
  11. asap1

    asap1 Jr. VIP Jr. VIP

    Joined:
    Mar 25, 2013
    Messages:
    4,407
    Likes Received:
    2,830
    Occupation:
    Quality Control PBN
    Your URLs per second could be better; it's low.