
Any way to extract links from these kinds of pages?

Discussion in 'Black Hat SEO Tools' started by Knoxgates, Mar 27, 2012.

  1. Knoxgates

    Knoxgates Supreme Member

    Joined:
    Aug 9, 2008
    Messages:
    1,266
    Likes Received:
    918
    As the title says: is there any way to extract links from these kinds of pages? For example, I have a link from EzineArticles:
    Code:
    http://ezinearticles.com/?Find-Out-How-To-Recover-Deleted-Files&id=6951780
    When I try to extract links from this page with the ScrapeBox link extractor, I get only these links:

    Code:
    http://ezinearticles.com/
    http://ezinearticles.com/?id=6965016&The-Importance-Of-Having-A-Server-Backup=
    http://ezinearticles.com/?type=experts
    http://ezinearticles.com/about.html
    http://ezinearticles.com/advertise/
    http://ezinearticles.com/affiliates/
    http://ezinearticles.com/author-terms-of-service.html
    http://ezinearticles.com/benefits/
    http://ezinearticles.com/cartoons/
    http://ezinearticles.com/contact.html
    http://ezinearticles.com/editorial-guidelines/
    http://ezinearticles.com/endorsements/
    http://ezinearticles.com/faq/
    http://ezinearticles.com/premium/
    http://ezinearticles.com/privacy-policy.html
    http://ezinearticles.com/rss/
    http://ezinearticles.com/sitemap.html
    http://ezinearticles.com/submit/
    http://ezinearticles.com/subscribe/
    http://ezinearticles.com/terms-of-service.html
    http://ezinearticles.com/training/
    http://ezinearticles.com/videos/ 
    But the page also contains other URLs in the following format, and these are the ones I need to extract:

    Code:
    <a href="/?Store-Sensitive-Information-Using-Windows-Server-Backup&id=6712577">Store Sensitive Information Using Windows Server Backup</a>
    <a href="/?What-to-Look-for-in-a-Server-Backup-Program&id=6169284">What to Look for in a Server Backup Program</a>
    <a href="/?Exchange-Server-Backup---A-Need-for-All-Businesses&id=6404457">Exchange Server Backup - A Need for All Businesses</a>
    <a href="/?Linux-Server-Backup---Unmatched-Server-Client-Backup&id=6368706">Linux Server Backup - Unmatched Server-Client Backup</a>
    <a href="/?Why-You-Need-a-Windows-Server-Backup&id=6617088">Why You Need a Windows Server Backup</a>
    <a href="/?Speed-Benefits-of-Image-Based-Server-Backups&id=2063515">Speed Benefits of Image Based Server Backups</a>
    <a href="/?Top-10-PC-and-Server-Backup-Features&id=5654062">Top 10 PC and Server Backup Features</a>
    <a href="/?Windows-Server-Backup-Top-Features&id=6408791">Windows Server Backup Top Features</a>
    <a href="/?How-An-Online-Server-Backup-Works?&id=4024999">How An Online Server Backup Works?</a>
    <a href="/?All-About-Server-Backup-Services&id=5572304">All About Server Backup Services</a>
    <a href="/?How-To-Prevent-Data-Loss-From-Occurring&id=6965139">How To Prevent Data Loss From Occurring</a>
    <a href="/?Todays-Password-Problems-and-Solutions&id=6956004">Today's Password Problems and Solutions</a>
    <a href="/?Effective-Backup-and-Recovery-Solution-for-Your-Business&id=6955804">Effective Backup and Recovery Solution for Your Business</a>
    <a href="/?All-About-An-Online-Backup&id=6962822">All About An Online Backup</a>
    <a href="/?The-Importance-of-Backup-Solutions&id=6950660">The Importance of Backup Solutions</a>
    <a href="/?Why-You-Need-Backup-Email-Service&id=6957007">Why You Need Backup Email Service</a>
    <a href="/?Understanding-the-Role-of-Differential-Backup&id=6956997">Understanding the Role of Differential Backup</a>
    <a href="/?How-To-Choose-A-Good-Online-Storage-Site&id=6949891">How To Choose A Good Online Storage Site</a>
    <a href="/?Find-Out-How-To-Recover-Deleted-Files&id=6951780">Find Out How To Recover Deleted Files</a>
    <a href="/?Cheap-Online-Backup-Solutions-for-Your-Computer&id=6948789">Cheap Online Backup Solutions for Your Computer</a>
    
    
    The http://ezinearticles.com prefix has been stripped from these links in the source code, so they appear only as relative URLs.

    Is there any way to scrape these kinds of pages? I have thousands of pages, and it's very time-consuming to do manually.

    Please help.
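Links written as `/?Title&id=123` are relative URLs, so one way to recover the full address is to resolve each `href` against the page's own URL. The following is only a minimal sketch, assuming Python's standard library; `LinkCollector` and `extract_links` are hypothetical names used for illustration and are not part of any tool mentioned in this thread.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against base_url."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # urljoin turns "/?Title&id=1" into "http://host/?Title&id=1"
                # and leaves URLs that are already absolute unchanged.
                self.links.append(urljoin(self.base_url, value))


def extract_links(page_source, base_url):
    """Return all links found in page_source as absolute URLs."""
    parser = LinkCollector(base_url)
    parser.feed(page_source)
    return parser.links


sample = (
    '<a href="/?Why-You-Need-a-Windows-Server-Backup&id=6617088">'
    "Why You Need a Windows Server Backup</a>"
)
for link in extract_links(sample, "http://ezinearticles.com/"):
    print(link)
```

Because the parser collects every anchor, the result would still need filtering if only the article links (those containing `&id=`) are wanted.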
     
  2. kokoloko75

    kokoloko75 Elite Member

    Joined:
    Jan 1, 2011
    Messages:
    1,628
    Likes Received:
    1,935
    Occupation:
    Design director
    Location:
    Paris (France)
    You can use a regular expression with a regex extractor, like this one:
    Code:
    http://codecanyon.net/item/regex-extractor-extract-everything-simply-/1327433
    I tested it; you should use this regular expression:
    Code:
    http://pastebin.com/b0GRy3j9
    You'll get this:

    [IMG]

    Export the results to a text file and open it with Notepad.
    Then use search-and-replace, like this:
    Code:
    Search : <a href="
    Replace : http://ezinearticles.com
    
    Search (without space) : & amp;
    Replace : &
    
    Search : ">
    Replace : (nothing)
    Finally, you'll get this:

    [IMG]

    Easy, right?
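The extract-then-replace steps above can also be done in one pass with a short script. This is a rough sketch only: the pattern below is a hypothetical stand-in and is not the expression from the pastebin link, and `article_urls` is an illustrative name. It assumes the article links always start with `/?` and end with `&id=` followed by digits, as in the examples from the first post.

```python
import re

# Hypothetical pattern (an assumption, not the pastebin regex): capture the
# href value of anchors that start with "/?" and end with "&id=<digits>".
# The "&" may appear escaped as "&amp;" in the raw page source.
ARTICLE_HREF = re.compile(r'<a href="(/\?[^"]*&(?:amp;)?id=\d+)"')


def article_urls(page_source, site="http://ezinearticles.com"):
    """Return absolute article URLs found in page_source."""
    urls = []
    for href in ARTICLE_HREF.findall(page_source):
        # Same effect as the manual search-and-replace steps:
        # prepend the site and turn "&amp;" back into "&".
        urls.append(site + href.replace("&amp;", "&"))
    return urls


source = (
    '<a href="/?All-About-An-Online-Backup&amp;id=6962822">'
    "All About An Online Backup</a>\n"
    '<a href="/faq/">FAQ</a>'
)
for url in article_urls(source):
    print(url)
```

Only the captured `href` value is kept, so the trailing `">` and the link text never need to be removed by hand.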

    Beny
     
  3. bulldawg88

    bulldawg88 Junior Member

    Joined:
    Jan 13, 2012
    Messages:
    167
    Likes Received:
    106
    Location:
    San Diego, CA
    Code:
    http://www.webmaster-toolkit.com/link-extractor.shtml
    Bulldawg
     
  4. Knoxgates

    Knoxgates Supreme Member

    @kokoloko75: Yes, this should work. Rep added.

    But I have thousands of URLs, and extracting links one by one is too time-consuming.
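For thousands of pages, the same extraction can be run in a loop instead of page by page. The sketch below is one possible approach, not a tested scraper: `fetch`, `collect_links`, and the generic `HREF` pattern are all hypothetical names, and it assumes the pages can be downloaded with a plain HTTP request and decoded as UTF-8.

```python
import re
from urllib.parse import urljoin
from urllib.request import urlopen

# Generic pattern (an assumption): the double-quoted href of any <a> tag.
HREF = re.compile(r'<a\s[^>]*?href="([^"]+)"', re.IGNORECASE)


def fetch(url):
    """Download a page and return its HTML as text (assumes UTF-8)."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def collect_links(page_urls, fetch_page=fetch):
    """Return the unique absolute links found on all pages, in first-seen order."""
    seen = set()
    ordered = []
    for page_url in page_urls:
        try:
            source = fetch_page(page_url)
        except OSError as exc:  # urllib's URLError is a subclass of OSError
            print(f"skipped {page_url}: {exc}")
            continue
        for href in HREF.findall(source):
            # Resolve relative links against the page they were found on.
            link = urljoin(page_url, href.replace("&amp;", "&"))
            if link not in seen:
                seen.add(link)
                ordered.append(link)
    return ordered
```

To use it, the list of page URLs could be read from a text file (one URL per line) and passed to `collect_links`, with the returned links written to another file. The `fetch_page` parameter exists so the download step can be swapped out, for example to add a delay between requests.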
     
  5. An71qu3

    An71qu3 Junior Member

    Joined:
    Apr 26, 2009
    Messages:
    185
    Likes Received:
    166
    I can make you a custom bot for that. Add me on Skype: an71qu3