Everyone using the same SBox comments, how to grab all URLs from such a blog?

Discussion in 'Black Hat SEO Tools' started by seeplusplus, Jun 14, 2011.

  1. seeplusplus

    seeplusplus Power Member

    Joined:
    Aug 18, 2008
    Messages:
    511
    Likes Received:
    163
    Lots of people are using the same spun comments, so when I search for "a comment string" I find plenty of auto-approve (AA) blogs.

    How can I scrape all the post URLs from such an AA blog?

    For example, one blog I found has around 200 posts. How can I get all the post URLs on that site so I can run SBox and post a comment on each post?

    Thanks very much|Thanks a lot|Cheers :D
     
  2. Autumn

    Autumn Elite Member

    Joined:
    Nov 18, 2010
    Messages:
    2,197
    Likes Received:
    3,041
    Occupation:
    I figure out ways to make money online and then au
    Location:
    Spamville
    1) See if there's a sitemap and if there is, parse all the post links from it.

    2) Spider the site and look for comment forms. An easy way is to fetch the index page and every page reached through the "older posts" or "previous posts" link, then fetch every URL with #comments in it, or where the anchor text is "No comments" or matches "[0-9]+ comments".

    3) site:domain.com +comments, site:domain.com "add comment" etc
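
For option 1, here is a minimal sketch of pulling the links out of a sitemap, assuming the file follows the standard sitemaps.org XML format. The function name `extract_sitemap_urls` and the sample XML are made up for illustration. Fetching the file and following sitemap index entries are left out.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemaps.org sitemap files.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def extract_sitemap_urls(xml_text):
    """Return every <loc> value found in a sitemap document (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    urls = []
    # iter() walks the whole tree, so this also picks up <loc> entries
    # inside a sitemap index without special handling.
    for loc in root.iter(SITEMAP_NS + "loc"):
        if loc.text and loc.text.strip():
            urls.append(loc.text.strip())
    return urls


# Made-up sample sitemap, for illustration only.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://example.com/post-one/</loc></url>
  <url><loc>http://example.com/post-two/</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in extract_sitemap_urls(SAMPLE):
        print(url)
```

A sitemap that omits the namespace declaration would return nothing here, so check the root element of the real file first.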
     
  3. seeplusplus

    seeplusplus Power Member


    Thanks buddy, I guess I could use this as a footprint in ScrapeBox then.
     
  4. Autumn

    Autumn Elite Member

    It would actually be faster and less error-prone to go the sitemap route first if the blog has one. Any kind of query with +comments, "add comment", etc. is going to get blocked by Google pretty fast.

    Assuming you're targeting WP blogs, here's a script I posted a while ago to fetch WP sitemap urls and turn them into a list of links:
    Code:
    http://www.blackhatworld.com/blackhat-seo/blogging/300926-need-all-wp-url-txt-file.html#post2725187
    
    Apparently SB has a built-in sitemap scraper too, but I am not an SB user.
     