
Need to filter URLs with certain keywords in them from a list of 5,000+. How can I do it?

Discussion in 'Black Hat SEO Tools' started by jon_xx_x, Feb 20, 2013.

  1. jon_xx_x

    jon_xx_x Jr. VIP Jr. VIP

    Joined:
    Nov 15, 2008
    Messages:
    3,108
    Likes Received:
    1,458
    I have a list of 5000+ URLs, and I need to remove the ones that have /forum/ or /forums/ in them.
    I don't have ScrapeBox anymore.
    Are there any options in OpenOffice or Notepad++ that can do such a thing?
    Or any other suggestions, without having to pay?
    Thanks.
     
  2. Schvamp

    Schvamp Power Member

    Joined:
    Feb 13, 2012
    Messages:
    684
    Likes Received:
    549
    Location:
    Hogwarts
    A PHP script, skype me :)
     
  3. innozemec

    innozemec Jr. VIP Jr. VIP

    Joined:
    Aug 19, 2011
    Messages:
    5,289
    Likes Received:
    1,799
    Location:
    www.Indexification.com
  4. Schvamp

    Schvamp Power Member

    Joined:
    Feb 13, 2012
    Messages:
    684
    Likes Received:
    549
    Location:
    Hogwarts
    Sorry, yes, Notepad++ works great for this. Hit CTRL + F and check out the Replace tab :)
     
  5. innozemec

    innozemec Jr. VIP Jr. VIP

    Joined:
    Aug 19, 2011
    Messages:
    5,289
    Likes Received:
    1,799
    Location:
    www.Indexification.com
    have you even read what OP needs?

    he doesn't want to replace, he wants to remove lines containing a certain keyword
     
  6. theMagicNumber

    theMagicNumber Regular Member

    Joined:
    May 13, 2010
    Messages:
    345
    Likes Received:
    195
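    Notepad++
    CTRL+F -> Replace -> select the "Regular expression" search mode (leave ". matches newline" unchecked)
    Use the following strings, one at a time:
    ^.*/forum/.*\r\n
    ^.*/forums/.*\r\n
    and replace with an empty string. The leading ^.* matters: without it, only the part of the line from /forum/ onward is deleted and the start of the URL gets joined to the next line. If the file has Unix line endings, use \n instead of \r\n. DONE, the lines are removed, exactly what the OP wants.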
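    For anyone who would rather script it, the same remove-whole-lines idea can be sketched with Python's re module. This is only a rough equivalent, not what Notepad++ runs internally; the helper name strip_forum_lines and the sample URLs are made up for illustration.

```python
import re

# Hypothetical helper, not part of any tool mentioned in this thread.
# "forums?" covers both /forum/ and /forums/.
# (?:\r?\n|$) consumes the line ending, and also handles a last line
# that has no trailing newline.
FORUM_LINE = re.compile(r"^.*/forums?/.*(?:\r?\n|$)", re.MULTILINE)


def strip_forum_lines(text: str) -> str:
    """Return text with every line containing /forum/ or /forums/ removed."""
    return FORUM_LINE.sub("", text)


# Made-up sample data with Windows line endings.
sample = (
    "http://example.com/blog/post-1\r\n"
    "http://example.com/forum/thread-9\r\n"
    "http://example.org/forums/index.php\r\n"
    "http://example.net/about\r\n"
)
print(strip_forum_lines(sample))
```

    Because the pattern is anchored at the start of the line, the whole URL goes, not just the part after the keyword.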
     
  7. innozemec

    innozemec Jr. VIP Jr. VIP

    Joined:
    Aug 19, 2011
    Messages:
    5,289
    Likes Received:
    1,799
    Location:
    www.Indexification.com
    ah, i didn't know notepad++ has regular expressions, sorry

    great to learn that!
     
  8. Bruut

    Bruut Regular Member

    Joined:
    Aug 9, 2012
    Messages:
    227
    Likes Received:
    149
    Raptor3
     
  9. Ashgriel

    Ashgriel Junior Member

    Joined:
    Jan 21, 2013
    Messages:
    168
    Likes Received:
    29
  10. _zeroone

    _zeroone Newbie

    Joined:
    Jan 25, 2013
    Messages:
    46
    Likes Received:
    2
    type file1.txt | find /v "/forum/" | find /v "/forums/" >> file2.txt
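    If you are not on Windows, or just prefer a script, a minimal Python sketch of the same keep-lines-that-don't-match pipeline could look like this. The function name filter_file is hypothetical, and unlike the >> redirect above it overwrites the output file instead of appending to it.

```python
# Keywords taken from the original question.
KEYWORDS = ("/forum/", "/forums/")


def filter_file(src: str, dst: str, keywords=KEYWORDS) -> int:
    """Copy src to dst, skipping every line that contains any keyword.

    Returns the number of lines kept. Matching is case-sensitive.
    """
    kept = 0
    with open(src, encoding="utf-8", errors="replace") as fin, \
            open(dst, "w", encoding="utf-8") as fout:
        for line in fin:
            # Keep the line only if none of the keywords appear in it.
            if not any(k in line for k in keywords):
                fout.write(line)
                kept += 1
    return kept
```

    Calling filter_file("file1.txt", "file2.txt") would then read the full list and write the cleaned one, assuming those are the file names you use.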
     
  11. jon_xx_x

    jon_xx_x Jr. VIP Jr. VIP

    Joined:
    Nov 15, 2008
    Messages:
    3,108
    Likes Received:
    1,458
    Thanks so much.
    Thanks for all the other replies too. I'll check them out if this doesn't work.
     
  12. kingtana

    kingtana Regular Member

    Joined:
    Nov 1, 2008
    Messages:
    277
    Likes Received:
    73
    Location:
    WWW
    textmechanic dot com should be able to handle it
     
  13. blackberry

    blackberry Power Member

    Joined:
    Apr 26, 2009
    Messages:
    675
    Likes Received:
    218
    Occupation:
    Making money
    Location:
    Planet Earth
    Excel will do it too, if you use filters such as:
    TEXT FILTER > CONTAINS > enter /forum/ and then delete the results.
    Repeat for /forums/.

    Then you're left with the results you want to use.
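
    The filter-then-delete steps above can also be sketched in Python as a split into two lists, so the removed URLs can be looked over before they are thrown away. The name partition_urls is made up, and matching is case-insensitive here, which is an assumption about what is wanted.

```python
def partition_urls(urls, keywords=("/forum/", "/forums/")):
    """Split urls into (kept, removed) based on keyword matches.

    Matching ignores case (an assumption, not something the thread asks for).
    """
    kept, removed = [], []
    for url in urls:
        if any(k in url.lower() for k in keywords):
            removed.append(url)  # the rows you would delete in the sheet
        else:
            kept.append(url)     # the rows left after deleting
    return kept, removed


# Made-up sample list.
kept, removed = partition_urls([
    "http://example.com/Forum/thread-1",
    "http://example.com/shop/item-2",
    "http://example.org/forums/",
])
print(len(kept), "kept,", len(removed), "removed")
```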