Scrapebox Question - How To Filter Out Certain Domains In List

Discussion in 'Black Hat SEO Tools' started by HelloInsomnia, Nov 16, 2010.

  1. HelloInsomnia

    HelloInsomnia Jr. Executive VIP Jr. VIP Premium Member

    Joined:
    Mar 1, 2009
    Messages:
    1,814
    Likes Received:
    2,910
    Basically I want to take a large list and filter out all the sites that are from certain domains such as:

    Code:
    http://en.wikipedia.org
    http://www.youtube.com
    
    How do I do this with a list I've already scraped? Is there a way to check it against my local blacklist file?

    Also, is there a way to filter out the root domains in my list as well? I only want inner pages.
     
  2. andreyg13

    andreyg13 Jr. VIP Jr. VIP

    Joined:
    Nov 13, 2009
    Messages:
    915
    Likes Received:
    1,774
    Occupation:
    SEO
    Location:
    http://seoshark.org
    Home Page:
    Just add them to your local blacklist file.
     
  3. bakxos

    bakxos Regular Member

    Joined:
    Aug 8, 2010
    Messages:
    498
    Likes Received:
    292
    Location:
    Scotland
    Import the first list. Then select import and choose the option "import and compare on domain level". Browse your folders and select the file with all the domains you do not want your list to contain. Import it and you are done.
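
For anyone who would rather do the same comparison outside Scrapebox, here is a minimal Python sketch of a domain-level blacklist filter. It is not how Scrapebox implements the feature. The function names `host_of` and `filter_blacklisted` and the sample URLs are made up for illustration, and it assumes an exact hostname match, so `www.youtube.com` in the blacklist would not remove `youtube.com`.

```python
from urllib.parse import urlsplit


def host_of(url):
    """Return the lowercased hostname of a URL, or '' if there is none."""
    url = url.strip()
    if "://" not in url:
        url = "http://" + url  # assumption: treat scheme-less entries as http
    return (urlsplit(url).hostname or "").lower()


def filter_blacklisted(urls, blacklist):
    """Keep only the URLs whose hostname is not in the blacklist."""
    blocked = {host_of(entry) for entry in blacklist if entry.strip()}
    return [u for u in urls if host_of(u) not in blocked]


# Hypothetical sample data; in practice both lists would be read from text files.
scraped = [
    "http://en.wikipedia.org/wiki/SEO",
    "http://example.com/blog/post",
    "http://www.youtube.com/watch?v=1",
]
blacklist = ["http://en.wikipedia.org", "http://www.youtube.com"]

kept = filter_blacklisted(scraped, blacklist)
```

With the sample data above, only the `example.com` URL is left in `kept`.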
     
    • Thanks x 2
  4. HelloInsomnia

    HelloInsomnia Jr. Executive VIP Jr. VIP Premium Member

    Ah, I see. After adding it I had to clear the list and reload it. Okay, that takes care of that, but what about filtering out only root domains and keeping inner pages?
     
  5. bakxos

    bakxos Regular Member

    That's a bit more complicated.
    1. Import the list of URLs whose root domains you want to exclude.
    2. Press "trim to root" to get all the root domains (if you already have the root domains, skip these steps).
    3. Save the root domains.
    4. Import the URLs that you want to use.
    5. Press import and then choose "import and compare on url level".
    6. Browse to the saved list of root domains you want to exclude and you are done.
     
  6. pisco

    pisco Regular Member

    Joined:
    Aug 13, 2010
    Messages:
    222
    Likes Received:
    42
    Location:
    Lisbon
    Have you tried putting in the footprint -"hxxp://whatever.com"? I use a similar footprint to remove any results that include closed-comment blogs.

    **EDIT**

    My bad, this doesn't do much for you since you already scraped the URLs, but it still might be helpful to someone who wants to filter out specific domains.
     
    Last edited: Nov 16, 2010
  7. HelloInsomnia

    HelloInsomnia Jr. Executive VIP Jr. VIP Premium Member

    I don't know the domains; I just want to filter out any root domain, if that's possible. If not, I suppose I can throw it in Excel, sort it by length, and do it that way.
     
  8. bakxos

    bakxos Regular Member

    It's quite simple.

    1. Import the list you want and press "trim to root".
    2. Save the trimmed list.
    3. Import the first list again.
    4. Then choose import and compare URLs and browse to the trimmed-to-root list (from steps 1 and 2).
    5. You are done.
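
The same trim-and-compare idea can be sketched in Python for anyone working outside Scrapebox. This is only an illustration under stated assumptions, not Scrapebox's own logic. The names `trim_to_root` and `keep_inner_pages` and the sample URLs are hypothetical, and every URL is assumed to include its scheme (`http://` or `https://`).

```python
from urllib.parse import urlsplit


def trim_to_root(url):
    """Reduce a URL to scheme://host, e.g. http://example.com/a/b -> http://example.com."""
    parts = urlsplit(url.strip())
    return f"{parts.scheme}://{parts.netloc}".lower()


def keep_inner_pages(urls):
    """Drop every URL that is identical to a trimmed root, keeping inner pages."""
    roots = {trim_to_root(u) for u in urls}  # the "trimmed list"
    # Compare on URL level, ignoring a trailing slash and letter case.
    return [u for u in urls if u.strip().rstrip("/").lower() not in roots]


# Hypothetical sample data standing in for a scraped list.
scraped = [
    "http://example.com",
    "http://example.com/",
    "http://example.com/blog/post-1",
    "http://other.org/page/",
]

inner = keep_inner_pages(scraped)
```

With the sample list, the two bare `example.com` entries are dropped and the two inner pages remain. No list of known domains is needed, since the roots are derived from the scraped list itself.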
     
    • Thanks x 1