
A tool to harvest all pages from a website?

Discussion in 'Black Hat SEO Tools' started by lablinks, Jan 20, 2011.

  1. lablinks

    lablinks Senior Member

    Joined:
    Apr 22, 2010
    Messages:
    926
    Likes Received:
    171
Is there a tool that can crawl a website & return URLs for all the pages on it?
     
  2. roberteb

    roberteb Regular Member

    Joined:
    Oct 30, 2010
    Messages:
    402
    Likes Received:
    120
    Location:
    UK
    You could use Xenu (slow on a really large site) or if all the pages are in the sitemap.xml you could use scrapebox sitemap link extractor plugin.
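If the site does have a sitemap.xml, the `<loc>` entries can be pulled out with standard shell tools too, no plugin needed. A minimal sketch; the sitemap content below is a made-up stand-in for a real `curl -s http://example.com/sitemap.xml`:

```shell
# A sample sitemap stands in here for the output of:
#   curl -s http://example.com/sitemap.xml
cat <<'EOF' > sitemap.xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://example.com/</loc></url>
  <url><loc>http://example.com/about.html</loc></url>
</urlset>
EOF

# grep -o prints only the matching part; sed strips the surrounding tags,
# leaving one URL per line.
grep -o '<loc>[^<]*</loc>' sitemap.xml | sed -e 's/<loc>//' -e 's|</loc>||'
```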
     
  3. lablinks

    lablinks Senior Member

    Joined:
    Apr 22, 2010
    Messages:
    926
    Likes Received:
    171
Is there a similar tool that works with proxies?
     
  4. lineguy

    lineguy Registered Member

    Joined:
    Apr 21, 2010
    Messages:
    70
    Likes Received:
    23
    I use httrack (it's free).

    Not sure if it's set up to work with proxies though.
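For the record, the httrack command line does take a proxy via its `-P` option (host:port, or user:pass@host:port). A hedged sketch; the site, output dir, and proxy below are all placeholder values, so the command is just echoed rather than run:

```shell
# HTTrack accepts an HTTP proxy with -P (-P proxy:port or -P user:pass@proxy:port).
# example.com and proxy.example:8080 are placeholders - swap in real values.
SITE="http://example.com/"
PROXY="proxy.example:8080"
CMD="httrack $SITE -O ./mirror -P $PROXY"

# Run this by hand once the placeholders are real:
echo "$CMD"
```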
     
  5. oinky222

    oinky222 Regular Member

    Joined:
    Oct 2, 2010
    Messages:
    389
    Likes Received:
    175
    Scrapebox
     
  6. lablinks

    lablinks Senior Member

    Joined:
    Apr 22, 2010
    Messages:
    926
    Likes Received:
    171
    I don't think SB can do it.
     
  7. SweetChuck

    SweetChuck Junior Member

    Joined:
    May 18, 2009
    Messages:
    165
    Likes Received:
    87
    Home Page:
It can harvest all URLs on a website with the sitemap scraper plugin
     
  8. ahiddenman

    ahiddenman Elite Member

    Joined:
    Dec 11, 2010
    Messages:
    2,647
    Likes Received:
    2,087
    Location:
    204.15.23.255
    Yeah Sb can do this :)
     
  9. theironlemming

    theironlemming Junior Member

    Joined:
    May 6, 2010
    Messages:
    148
    Likes Received:
    40
Yup, scrapebox is definitely the tool I'd use for tasks like this.
     
  10. sirgold

    sirgold Supreme Member

    Joined:
    Jun 25, 2010
    Messages:
    1,260
    Likes Received:
    645
    Occupation:
    Busy proving the Pareto principle right
    Location:
    A hot one
    wget in recursive mode
    Code:
    wget -r website
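Since the OP only wants the list of URLs (not the files), `--spider` crawls without saving anything and the visited URLs can be grepped out of the log afterwards. A sketch; a canned log stands in for real wget output here, and the exact log line format can vary a bit between wget versions:

```shell
# Real usage would be:
#   wget --spider -r -nv -o wget.log http://example.com/
# The canned log below stands in for that output so the parsing
# step can be shown offline.
cat <<'EOF' > wget.log
2011-01-20 10:00:01 URL: http://example.com/ 200 OK
2011-01-20 10:00:02 URL: http://example.com/page1.html 200 OK
2011-01-20 10:00:02 URL: http://example.com/page2.html 200 OK
EOF

# Pull anything that looks like a URL out of the log, deduplicated.
grep -o 'http://[^ ]*' wget.log | sort -u
```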
     
  11. flexnds

    flexnds Power Member

    Joined:
    Jan 4, 2010
    Messages:
    643
    Likes Received:
    680
    Occupation:
    Internet Marketing, Web development, Internet Repu
    Location:
    AZ
Yep, scrapebox: scrapebox dot com /bhw. It's only 57 bucks.
     
  12. lablinks

    lablinks Senior Member

    Joined:
    Apr 22, 2010
    Messages:
    926
    Likes Received:
    171
    OK, I've checked. SB doesn't do it. It needs the site to have a sitemap in order to harvest the links.
     
  13. vishalgmistry

    vishalgmistry Regular Member

    Joined:
    Sep 25, 2008
    Messages:
    321
    Likes Received:
    520
If it's a static website (only HTML and images), HTTrack will do the job.
     
  14. flexnds

    flexnds Power Member

    Joined:
    Jan 4, 2010
    Messages:
    643
    Likes Received:
    680
    Occupation:
    Internet Marketing, Web development, Internet Repu
    Location:
    AZ
Then generate a sitemap using a free online tool. There are countless free ones; just google it.
     
  15. flexnds

    flexnds Power Member

    Joined:
    Jan 4, 2010
    Messages:
    643
    Likes Received:
    680
    Occupation:
    Internet Marketing, Web development, Internet Repu
    Location:
    AZ
http://www.xml-sitemaps.com/
up to 500 pages for free
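And if the site is over that 500-page limit, a basic sitemap file is easy enough to generate locally from a plain URL list. A minimal sketch (only emits `<loc>` entries, no lastmod/priority; the URLs are placeholders):

```shell
# Turn a plain list of URLs (one per line) into a minimal sitemap.xml.
cat <<'EOF' > urls.txt
http://example.com/
http://example.com/about.html
EOF

{
  echo '<?xml version="1.0" encoding="UTF-8"?>'
  echo '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
  while read -r url; do
    echo "  <url><loc>$url</loc></url>"
  done < urls.txt
  echo '</urlset>'
} > sitemap.xml

# Quick sanity check: one <loc> entry per input URL.
grep -c '<loc>' sitemap.xml
```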
     
  16. service365

    service365 Newbie

    Joined:
    Jan 18, 2011
    Messages:
    20
    Likes Received:
    0
I need this too.