How Do You Organise Your ScrapeBox URL Lists?

Discussion in 'Black Hat SEO' started by tbrad, Oct 5, 2010.

  1. tbrad

    tbrad Registered Member

    Joined:
    Aug 11, 2010
    Messages:
    99
    Likes Received:
    11
    I'm still trying to figure out the best way of organizing my URL Lists.

So far it's pretty basic.

    I have the following files...

HARVESTED_WordPress.txt
HARVESTED_BlogEngine.txt
HARVESTED_MovableType.txt
HARVESTED_Captcha.txt

In those files I save the raw harvested URLs: unsorted, not PageRank-checked, unverified.

Then, after I've PageRank-checked them and verified they're open for comments, I throw out the useless ones and save the good ones to master lists like this...

MASTER_WordPress.txt
MASTER_BlogEngine.txt
MASTER_MovableType.txt
MASTER_Captcha.txt

Once I build up enough, I'd like to start splitting the lists by PageRank.
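Something like this little script is what I had in mind for the splitting. Just a sketch, not something I actually run yet: it assumes I've first exported the list as tab-separated "url<TAB>pagerank" pairs, and the filenames here are made up to match my naming scheme.

```python
import collections

# Sketch: split a PageRank-checked list into one file per PR value.
# Assumes each input line looks like "http://example.com/post\t3".
# Input/output filenames are placeholders.

def split_by_pr(infile="MASTER_WordPress_pageranked.txt"):
    buckets = collections.defaultdict(list)
    with open(infile) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            url, _, pr = line.rpartition("\t")
            if not url:
                continue  # skip lines that have no PR column
            buckets[pr].append(url)
    # Write one file per PageRank value, e.g. MASTER_WordPress_PR3.txt
    for pr, urls in buckets.items():
        with open("MASTER_WordPress_PR%s.txt" % pr, "w") as out:
            out.write("\n".join(urls) + "\n")
```

That way each blast could target a minimum PR just by picking the right file.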

How do you guys order yours?
Any ideas or ways I could improve would be welcome.

    Thanks guys.
     
    Last edited: Oct 5, 2010
  2. jellyfish

    jellyfish Junior Member

    Joined:
    Sep 16, 2008
    Messages:
    185
    Likes Received:
    36
I like working with a spreadsheet (Excel or whatever you like).
Also, after each blast I'll check which ones were auto-approve and separate them into an additional file.
What's taking most of my time is joining all the batches (1 mil results in each batch file) and removing duplicate domains/URLs. Wish there was a faster way of doing this :)
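The fastest thing I've tried so far is a small script instead of doing it in the spreadsheet. Just a rough sketch (the batch_*.txt file pattern and output name are made up): it merges all the batch files and keeps only the first URL seen per domain.

```python
import glob
from urllib.parse import urlsplit

# Sketch: merge batch files and dedupe by domain.
# Assumes batch files are named like batch_*.txt, one URL per line
# (the file pattern and output filename are placeholders).

def merge_unique_domains(pattern="batch_*.txt", outfile="merged_unique.txt"):
    seen = set()
    with open(outfile, "w") as out:
        for path in sorted(glob.glob(pattern)):
            with open(path) as f:
                for line in f:
                    url = line.strip()
                    if not url:
                        continue
                    domain = urlsplit(url).netloc.lower()
                    if domain and domain not in seen:
                        seen.add(domain)
                        out.write(url + "\n")
    return len(seen)  # number of unique domains kept
```

Keeping just the seen-domains set in memory means you never have to load all the batch files at once, which is what kills the spreadsheet.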
     
  3. jascoken

    jascoken Senior Member

    Joined:
    Nov 1, 2010
    Messages:
    1,135
    Likes Received:
    751
    Gender:
    Male
    Occupation:
    IT/Web Systems & Development...
    Location:
    Sussex:UK
    Crazyflx has a great post on removing dupes and working with HUGE files:

    crazyflx.com/scrapebox-tips/remove-duplicate-domains-urls-from-huge-txt-files-with-ease