Anyone Know How To Rip All URLs From A Website?

Discussion in 'Black Hat SEO' started by mudbutt, Jul 29, 2011.

  1. mudbutt

    mudbutt Jr. Executive VIP Jr. VIP Premium Member

    Joined:
    Jun 23, 2010
    Messages:
    1,817
    Likes Received:
    4,284
    Location:
    ghosted
    There's a website that has 6k indexed pages in Google. The pages aren't linked to one another (it's a secret directory of sorts) and there is no sitemap. Does anyone know how I can get all of the URLs from that site? Google only shows 1k results. Any help would be appreciated, thanks!
     
  2. brutang

    brutang Junior Member

    Joined:
    Nov 5, 2009
    Messages:
    193
    Likes Received:
    81
    Check out a program called Xenu's Link Sleuth. You give it a home page and it crawls the entire site. It'll even give you the URLs of their images, scripts, etc., but you can filter all of that out later. Just make sure not to scan for external/outbound links (it's an option).
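For anyone curious what a crawler like that does on each page, here is a minimal Python sketch of the "internal links only" step. It is not how Xenu itself works internally; the names `LinkCollector` and `internal_links` are made up for illustration, and it only parses HTML you already have rather than fetching anything. A full crawl would repeat this for every newly found URL. Keep in mind that a link-following crawler can only reach pages that are linked from somewhere.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def internal_links(page_url, html):
    """Return the sorted, de-duplicated same-host URLs linked from one page."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(page_url).netloc
    found = set()
    for href in collector.hrefs:
        # Resolve relative links against the page URL and drop #fragments.
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        parsed = urlparse(absolute)
        # Skip mailto:, javascript:, and anything on another host.
        if parsed.scheme in ("http", "https") and parsed.netloc == host:
            found.add(absolute)
    return sorted(found)
```

Images and scripts are ignored here because only `<a href>` is read, which roughly matches filtering them out afterwards.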
     
  3. Shishko

    Shishko Regular Member

    Joined:
    Jan 14, 2010
    Messages:
    210
    Likes Received:
    86
    Occupation:
    Student/Gigolo
    Location:
    Europe
    If you have ScrapeBox, you could do this.

    Write a custom footprint of site:domain.com
    and then add a bunch of keywords. Each keyword will return a different set of results, so if the keywords are diverse enough, you should be able to harvest all the URLs.
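The footprint-plus-keywords idea can be sketched in a few lines of Python. This is only an illustration of the logic, not ScrapeBox's own behavior: `build_site_queries` and `merge_results` are hypothetical names, and the actual searching is left out, so the result lists are assumed to come from whatever harvester you use.

```python
def build_site_queries(domain, keywords):
    """Pair a site: footprint with each keyword, skipping blanks and duplicates."""
    seen = set()
    queries = []
    for keyword in keywords:
        keyword = keyword.strip()
        if not keyword or keyword.lower() in seen:
            continue
        seen.add(keyword.lower())
        queries.append(f"site:{domain} {keyword}")
    return queries


def merge_results(result_batches):
    """Merge per-query URL lists into one list without duplicates.

    A dict is used so the first-seen order of the URLs is kept.
    """
    merged = {}
    for batch in result_batches:
        for url in batch:
            merged.setdefault(url, None)
    return list(merged)
```

Each query is capped by the search engine separately, so the overlap between batches is expected; merging is what gets you closer to the full list.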
     
  4. flyingbear

    flyingbear Junior Member

    Joined:
    Mar 7, 2011
    Messages:
    195
    Likes Received:
    19
    Well, what you need is a sitemap generator. Search for GSiteCrawler, give it the homepage, and it will crawl everything. Then output the link list. It is open source.
     
  5. CoyoteAssassin

    CoyoteAssassin Elite Member

    Joined:
    Jan 3, 2010
    Messages:
    1,862
    Likes Received:
    3,906
    Occupation:
    Full Time IMer
    Location:
    USA
    If you are looking for a sitemap program, try A-1 Sitemap Generator. It will crawl for days looking for everything. Unlike most, it has no page limit.

    If I saw the website, I would be able to provide a solution. I do this type of work every day.