
scraping pages from an autoapprove domain

Discussion in 'Black Hat SEO' started by cloakndagger, Feb 23, 2011.

  1. cloakndagger (Power Member)
     Joined: Oct 31, 2010 | Messages: 613 | Likes Received: 173
    OK, I've been spamming, er sorry, posting a lot of constructive comments, and I've just started checking links straight away to find auto-approve links.
    Anyway, I have a few, we're talking 100 or so, but I know other pages/posts are auto-approve too, judging by the 1,500 comments lol.
    How do I get all the pages from the domain using ScrapeBox?
    Cheers,
    CnC
     
  2. movieman32 (Regular Member)
     Joined: Aug 6, 2008 | Messages: 371 | Likes Received: 346
    There are two easy ways. Either way, start with the same prep:

    1. Load all your auto-approve URLs into the harvester and trim to root. Remove all duplicate URLs.
    2. Save the list as a text file.

    Now comes your choice.
    1. In the footprint search box, type in site:
    2. Import your list of sites (trimmed to root) into the keyword box

    Or:
    1. Open your trimmed-to-root list and do a find and replace all:
    find http:// and replace with site:http://
    Save the list.
    2. Import the list into the keyword search box.

    Start harvesting.
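
    If you'd rather script that find and replace instead of doing it by hand, here's a minimal Python sketch of the same idea. The file names (autoapprove_urls.txt, keywords.txt) are just placeholders for your own lists:

Code:
# Trim to root, dedupe, and prepend "site:" to each domain
# (a scripted version of the prep + option 2 above).
from urllib.parse import urlparse

with open("autoapprove_urls.txt") as f:
    urls = [line.strip() for line in f if line.strip()]

roots = set()
for url in urls:
    parsed = urlparse(url)
    if not parsed.netloc:
        continue  # skip anything that isn't a full URL with http://
    # "trim to root" = keep only scheme + domain, drop the path
    roots.add(f"{parsed.scheme}://{parsed.netloc}/")

with open("keywords.txt", "w") as f:
    for root in sorted(roots):
        f.write(f"site:{root}\n")  # one site: query per domain

    Import keywords.txt into the keyword box and you're at the same point as option 2.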

    You are going to wind up with loads of tag pages, PDFs, and archive pages.

    What I do is open the harvested list in Excel or OpenOffice and do another find and replace:

    find *tag* and leave the replace field blank. Be sure to type in the asterisks, or Excel will only replace the word "tag" with a blank space (you want to delete the whole URL).
    Replace all - this will eliminate tons of pages you can't use for commenting.

    Here are a few more common deletions I use:
    *rss*
    *feed*
    *archive*
    *xml*
    *pdf*
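
    If a spreadsheet isn't handy, the same cleanup is easy to script. A rough Python sketch that treats the wildcard patterns above as plain substrings (harvested.txt and cleaned.txt are placeholder names):

Code:
# Drop any harvested URL containing one of these fragments,
# the same effect as the *pattern* find-and-replace deletions above.
# Note: substring matching is as blunt as the wildcards, so a domain
# like "vintagecars.com" would also be dropped for containing "tag".
JUNK = ("tag", "rss", "feed", "archive", "xml", "pdf")

with open("harvested.txt") as f:
    urls = [line.strip() for line in f if line.strip()]

kept = [u for u in urls if not any(j in u.lower() for j in JUNK)]

with open("cleaned.txt", "w") as f:
    f.write("\n".join(kept) + "\n")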
     
    Last edited: Feb 23, 2011
  3. cloakndagger (Power Member)
     Joined: Oct 31, 2010 | Messages: 613 | Likes Received: 173
    I know I've thanked you already, but I'll give you a written thanks as well for such a clear method (or two) of doing it.
    Once I've got around 1k domains I'll do the above and fire the list up here.
    Thanks again movieman32, you're a star :)