Scrape URLs of my article site

Discussion in 'Black Hat SEO Tools' started by Krusty, Feb 23, 2015.

  1. Krusty

    Krusty Registered Member

    Joined:
    Jul 14, 2014
    Messages:
    86
    Likes Received:
    35
    Is there a scraper I can use to collect the URLs of all the articles on my website? There are over 1,000 articles and I don't want to do it manually. I want to be able to submit all the links to ScrapeBox.
    Is ScrapeBox designed to scrape a single website for all of its URLs?
    Any suggestions?
     
  2. cottonwolf

    cottonwolf Regular Member

    Joined:
    Jan 20, 2015
    Messages:
    469
    Likes Received:
    239
    I believe scraping your own website for links with the "site:yourdomain.com" operator should work. I believe it gives you the links indexed by Google, so pages Google has not indexed would be missing.
     
  3. BassTrackerBoats

    BassTrackerBoats Super Moderator Staff Member Moderator Jr. VIP

    Joined:
    Mar 10, 2010
    Messages:
    16,774
    Likes Received:
    30,796
    Occupation:
    Selling CPA Sites
    Location:
    Not England
    There is a tool called Screaming Frog that will do that for you.

    I'm sure that ScrapeBox does that as well.
     
  4. bartosimpsonio

    bartosimpsonio Jr. VIP Jr. VIP Premium Member

    Joined:
    Mar 21, 2013
    Messages:
    12,492
    Likes Received:
    11,190
    Occupation:
    CHEAP
    Location:
    DATASETS
    You should probably create a sitemap script for your own site and get the list of URLs from there.
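
    Once a sitemap exists, pulling the URL list out of it takes only a few lines. A minimal sketch, assuming the sitemap follows the standard sitemaps.org format; the function name and the example.com URLs are made up for illustration:

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemaps.org sitemap files.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def extract_sitemap_urls(xml_text):
    """Return every <loc> value found in a sitemap XML string."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]


# Hypothetical sitemap content; a real one would be read from a file
# or downloaded from the site.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/articles/one</loc></url>
  <url><loc>https://example.com/articles/two</loc></url>
</urlset>"""

urls = extract_sitemap_urls(sample)
print("\n".join(urls))
```

    Writing the result to a text file, one URL per line, should give a list that can be loaded into whatever tool you want to submit the links with.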
     
  5. M4XW3LL

    M4XW3LL Jr. VIP Jr. VIP

    Joined:
    Feb 5, 2013
    Messages:
    1,094
    Likes Received:
    1,275
    You can use xml-sitemaps.com; it will generate a sitemap of your website with all the URLs.
     
  6. Automated

    Automated Regular Member

    Joined:
    Jun 7, 2012
    Messages:
    289
    Likes Received:
    123
    Location:
    Online
    If it's WordPress-based, there are a number of sitemap plugins you can install that will help you out...
     
  7. Florent1933

    Florent1933 Newbie

    Joined:
    Nov 23, 2010
    Messages:
    35
    Likes Received:
    30
    Using Screaming Frog would probably be the most accurate solution. Just take care not to use too many threads, otherwise your own server could ban your IP (it has already happened to me... ^^)
     
  8. loopline

    loopline Jr. VIP Jr. VIP

    Joined:
    Jan 25, 2009
    Messages:
    3,807
    Likes Received:
    2,028
    Gender:
    Male
    As BassTrackerBoats noted, Screaming Frog will actually "crawl" your website.

    If you want to use ScrapeBox and there is a sitemap (which I doubt, or you wouldn't be asking, but if there is), then you can use the sitemap addon and select deep crawl in the settings.

    Otherwise you can do a

    site:domain.com
    search in Google, for example, then load the results into the link extractor addon and extract internal links. Then export the results, load them back in, and extract internal links again. Keep doing this until you are happy that you have all the URLs.

    Whichever program you use, bear in mind not to turn the connections up too high. It may seem like you want to go faster, but remember you are hammering away on your own web server and you could take it down. I inadvertently crashed a web server using ScrapeBox. :)
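
    The "extract internal links, load them back in, repeat" loop above can be sketched as a small script if you would rather not do it by hand. This is only a rough illustration and not how ScrapeBox or Screaming Frog work internally; the function names, the one-second delay, and the page cap are all assumptions. It fetches one page at a time with a pause in between, precisely so it does not hammer the server:

```python
import time
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def internal_links(page_url, html):
    """Return absolute links on the page that stay on the same host."""
    parser = LinkParser()
    parser.feed(html)
    host = urlparse(page_url).netloc
    found = set()
    for href in parser.hrefs:
        # Resolve relative links and drop "#fragment" parts.
        absolute = urldefrag(urljoin(page_url, href)).url
        parts = urlparse(absolute)
        if parts.scheme in ("http", "https") and parts.netloc == host:
            found.add(absolute)
    return found


def fetch(url):
    """Download a page and return its body as text."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch_page=fetch, delay=1.0, max_pages=2000):
    """Follow internal links breadth-first until no new URLs turn up."""
    seen = {start_url}
    queue = deque([start_url])
    fetched = 0
    while queue and fetched < max_pages:
        url = queue.popleft()
        try:
            html = fetch_page(url)
        except OSError:
            continue  # skip pages that fail to load
        fetched += 1
        for link in internal_links(url, html) - seen:
            seen.add(link)
            queue.append(link)
        time.sleep(delay)  # one request at a time, with a pause
    return sorted(seen)
```

    Calling something like crawl("https://yourdomain.com/") would return the sorted list of internal URLs it found. Raising the delay is the script's equivalent of turning the connections down.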
     