Extract URLs from Text?

Discussion in 'Black Hat SEO' started by iitob, Mar 7, 2010.

  1. iitob

    iitob Jr. VIP Jr. VIP Premium Member

    Joined:
    Jun 26, 2009
    Messages:
    139
    Likes Received:
    16
    Got a text file full of URLs, but there are random bits of text in between that prevent me from throwing them straight into a PR checking tool. Anyone know of a good (ideally free) tool for extracting URLs from text or a text file?
     
  2. Kid Shaleen

    Kid Shaleen Regular Member

    Joined:
    Oct 29, 2009
    Messages:
    250
    Likes Received:
    63
    Search the web for a free, reasonably powerful text editor and an additional filter utility if needed.

    1) Load the text file. Search and replace spaces and end-of-sentence characters (i.e. "." "!" "?") with line feeds;

    2) Filter the file and output only those lines containing "http" or "www" or both.
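
    For anyone who would rather script those two steps than do them in an editor, here is a minimal Python sketch. The function name `filter_url_tokens` is made up for illustration. One assumption beyond the steps as written: end-of-sentence characters are only treated as breaks when followed by whitespace or the end of the text, since replacing every "." would also split the dots inside domain names.

```python
import re

def filter_url_tokens(text):
    """Hypothetical helper: split text into lines, keep lines with http/www."""
    # Step 1: turn sentence punctuation (only when followed by whitespace or
    # the end of the text) and runs of whitespace into line feeds.
    broken = re.sub(r"[.!?]+(?=\s|$)", "\n", text)
    broken = re.sub(r"\s+", "\n", broken)
    # Step 2: output only those lines containing "http" or "www".
    return [line for line in broken.split("\n")
            if "http" in line or "www" in line]
```

    Calling `filter_url_tokens(open("urls.txt").read())` (with `urls.txt` standing in for your own file) would give a list you can write back out one URL per line.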
     
  3. slackgen99

    slackgen99 Junior Member

    Joined:
    Nov 8, 2009
    Messages:
    144
    Likes Received:
    20
    I am using TextPipe; you can extract, replace, remove, etc. anything with it.
     
    • Thanks Thanks x 2
  4. iitob

    iitob Jr. VIP Jr. VIP Premium Member

    Joined:
    Jun 26, 2009
    Messages:
    139
    Likes Received:
    16
    That would work, except there are words/phrases/sentences between some of the URLs. I need to isolate just the URLs amongst them.

    Also tried out TextPipe, but couldn't work out how to apply filters and the different steps... very confusing for me, and not so practical to be using on a regular basis. Appreciate the suggestion though.
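
    If the goal is to isolate just the URLs from the surrounding words, a regular expression is one possible approach. The sketch below is only a rough illustration, not a full URL parser: the pattern and the name `extract_urls` are assumptions, and it simply grabs anything starting with "http://", "https://" or "www." up to the next whitespace, then trims trailing punctuation.

```python
import re

# Assumed pattern: a URL starts with http://, https:// or www. and runs
# until whitespace, an angle bracket, or a quote character.
URL_PATTERN = re.compile(r"(?:https?://|www\.)[^\s<>\"']+", re.IGNORECASE)

def extract_urls(text):
    """Hypothetical helper: return unique URLs in order of first appearance."""
    urls = []
    for match in URL_PATTERN.findall(text):
        # Drop punctuation that belongs to the sentence, not the URL.
        cleaned = match.rstrip(".,;:!?)")
        if cleaned and cleaned not in urls:
            urls.append(cleaned)
    return urls
```

    Printing the result with `"\n".join(extract_urls(text))` gives one URL per line, which is the shape most checking tools expect.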
     
  5. lobo13

    lobo13 Regular Member

    Joined:
    Nov 3, 2009
    Messages:
    297
    Likes Received:
    75
    Location:
    └A
    Then you need to use SENuke, that'll do the job for you...
     
  6. iitob

    iitob Jr. VIP Jr. VIP Premium Member

    Joined:
    Jun 26, 2009
    Messages:
    139
    Likes Received:
    16
    Seems like overkill to buy it simply for its URL extracting capabilities.
     
    • Thanks Thanks x 1
  7. Micallef

    Micallef Supreme Member

    Joined:
    Apr 29, 2009
    Messages:
    1,345
    Likes Received:
    1,221
    Occupation:
    SE Manipulator
    Location:
    London, UK
    Home Page:
    • Thanks Thanks x 6
  8. iitob

    iitob Jr. VIP Jr. VIP Premium Member

    Joined:
    Jun 26, 2009
    Messages:
    139
    Likes Received:
    16
  9. fansfans

    fansfans Regular Member

    Joined:
    Aug 21, 2009
    Messages:
    343
    Likes Received:
    140
    trying to download and follow this line. thank you
     
  10. johnrapp

    johnrapp Registered Member

    Joined:
    May 4, 2011
    Messages:
    72
    Likes Received:
    8
    Occupation:
    CEO @ TCB
    Location:
    Greenwood, IN
    Home Page:
  11. Black.Star

    Black.Star Junior Member

    Joined:
    Oct 4, 2011
    Messages:
    185
    Likes Received:
    1,028
    Occupation:
    IT security specialist
    Location:
    Europe
    Linux...

    Nuff said :)
     
  12. teh_crush

    teh_crush Newbie

    Joined:
    Sep 19, 2011
    Messages:
    4
    Likes Received:
    4
    How is the text formatted? Is it URLs mixed with text, or is it raw HTML?

    There are solutions in PHP, but you would have to know what your txt file looks like.
     
  13. mrtwister_65

    mrtwister_65 Regular Member

    Joined:
    Dec 30, 2009
    Messages:
    462
    Likes Received:
    534
    The easiest solution is the Web Developer add-on for Firefox.

    Go to "Information" > "View Link Information". It will open a new tab with only the links on the page. All other text and images will be removed. That is, if your links are inside an HTML file.
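
    For links inside an HTML file, roughly the same result can be had without a browser. This is a small sketch using Python's standard `html.parser` module. The names `LinkCollector` and `extract_links` are made up here, and it only collects `href` values from `<a>` tags, which may differ from what the add-on reports.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Hypothetical helper: collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before this call.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def extract_links(html):
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links
```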
     
    • Thanks Thanks x 1
    Last edited: Oct 14, 2011
  14. shezboy

    shezboy Jr. VIP Jr. VIP Premium Member

    Joined:
    Sep 17, 2008
    Messages:
    3,573
    Likes Received:
    5,044
    Gender:
    Male
    Location:
    UK
    Wow, now that was an easy option. Thanks for the tip. I had this add-on but didn't know it would do that. Sure will come in handy in the future too.

    Thanks again,

    Shez :)