How to clean your LinkWheel Bandit URL List of DEAD Posts

Discussion in 'Black Hat SEO' started by Patchworks, Oct 5, 2011.

    Ok, I've been playing around with a method to clean your LWB URL list.

    It gets rid of DEAD (404) pages and any LIVE post whose links all point to DEAD URLs.

    Ok, first EXPORT your URLs to a text file. We will call this "List A"!

    Next, run "List A" through the ScrapeBox Alive Checker! I generally retest the failed ones a time or two, but most of the time it is correct on the first run since we are dealing with major web 2.0 sites.

    Now save the alive URLs out to a new text file. You have now filtered out all of your 404 pages. We will call this "List B"!
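
For anyone who wants to see what this step boils down to, here is a rough Python sketch. It is not how the ScrapeBox Alive Checker works internally; `http_status` and `split_alive` are names made up for this example, and treating only a final 200 response as "alive" is an assumption. Note that `urlopen` follows redirects, so a deleted post that redirects to an ad page still counts as alive here, which is why the later steps are needed.

```python
from http.client import HTTPException
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen


def http_status(url, timeout=10):
    """Return the final HTTP status code for url, or None if the request fails."""
    req = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    try:
        with urlopen(req, timeout=timeout) as resp:
            return resp.status
    except HTTPError as err:
        return err.code  # e.g. 404 for a dead page
    except (URLError, OSError, ValueError, HTTPException):
        return None  # DNS failure, timeout, malformed URL, broken response


def split_alive(urls, get_status=http_status, retries=2):
    """Split urls into (alive, dead).

    A URL that does not return 200 is retried up to `retries` more times
    before it is put on the dead list (assumption: 200 means alive).
    """
    alive, dead = [], []
    for url in urls:
        ok = False
        for _ in range(1 + retries):
            if get_status(url) == 200:
                ok = True
                break
        (alive if ok else dead).append(url)
    return alive, dead
```

Reading "List A" one URL per line, calling `split_alive` on it, and writing the `alive` list back out would give you the equivalent of "List B". The `get_status` argument is only there so the check can be swapped out or tested without hitting the network.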

    Next, run Check Links and load "List B" as both your "Websites" file and your "Blogs" file. This checks every post for the presence of any of your URLs! Do not check the "Match Domain Only" option on the Check Links screen.

    Now export your "FOUND" and "NOT FOUND" results to separate files. This will catch the VAST majority of your GOOD posts, but we still have more work to do.
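
The idea behind this pass can be sketched in Python as below. This is only an illustration of the logic, not the ScrapeBox implementation: `check_links` is a hypothetical helper, it expects the page HTML to be downloaded already, it matches full URLs exactly apart from a trailing slash (roughly what leaving "Match Domain Only" unchecked implies), and ignoring a post's link to itself is my own assumption.

```python
from html.parser import HTMLParser


class _HrefCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def _norm(url):
    # Simplification: compare URLs ignoring surrounding space and a trailing slash.
    return url.strip().rstrip("/")


def check_links(pages, targets):
    """Split post URLs into (found, not_found).

    pages:   dict mapping each post URL to its downloaded HTML
    targets: iterable of URLs to look for (here, the same "List B")
    A post is "found" if it links to at least one target other than itself.
    """
    wanted = {_norm(t) for t in targets}
    found, not_found = [], []
    for url, html in pages.items():
        parser = _HrefCollector()
        parser.feed(html)
        links = {_norm(h) for h in parser.hrefs}
        links.discard(_norm(url))  # assumption: a self-link does not count
        (found if links & wanted else not_found).append(url)
    return found, not_found
```

Passing the same list as both `pages` keys and `targets` mirrors loading "List B" on both sides of the check.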

    Ok, this is the weird part. Take your original "List B", which has all the 404s removed, load it into ScrapeBox, then trim to root and remove duplicate domains. Go through and remove any domains that are not yours (for example, www.gather.com, http://www.blogtext.org, www.blurty.c...la.teenick.com). Save this out as a "Domains Trimmed" list. Now run Check Links again, using your "NOT FOUND" list as your "Blogs" file and the "Domains Trimmed" list as your "Websites" file. When it is done, append the new "FOUND" results to your EXISTING "FOUND" list, then export the new "NOT FOUND" results and overwrite the old "NOT FOUND" list.
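
The "trim to root and remove duplicate domains" part can be sketched like this in Python. Again, this is just an illustration under assumptions: `trim_to_root` and `unique_roots` are made-up names, and the `shared_hosts` argument stands in for the manual step of deleting domains that are not yours.

```python
from urllib.parse import urlsplit


def trim_to_root(url):
    """Reduce a URL to scheme://host, dropping path, query and fragment."""
    if "://" not in url:
        url = "http://" + url  # assumption: bare hosts are plain http
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc.lower()}"


def unique_roots(urls, shared_hosts=()):
    """Trim every URL to its root and drop duplicates, keeping first-seen order.

    shared_hosts: hostnames that are not yours (e.g. "www.gather.com") and
    should be left out of the trimmed list.
    """
    skip = {h.lower() for h in shared_hosts}
    seen, roots = set(), []
    for url in urls:
        url = url.strip()
        if not url:
            continue  # ignore blank lines in the exported file
        root = trim_to_root(url)
        host = urlsplit(root).hostname
        if host in skip or root in seen:
            continue
        seen.add(root)
        roots.append(root)
    return roots
```

The resulting list plays the role of "Domains Trimmed": what is left is mostly your own subdomain roots, such as http://keywords123.bloghi.com.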

    Ok, let me explain. I don't know why LWB does this, but it seems some of the posts only have the TRIMMED domain portion of the URL in the link, for example http://keywords123.bloghi.com. When we do this last step, we are taking the posts that contain those TRIMMED domain links out of our "NOT FOUND" list.

    When you are done, you will have removed the following types of links:

    404 Pages that are simply DEAD!
    Pages that are DELETED posts, where the website redirects you to an ad page.
    LWB posts whose links ALL point to LWB posts that are DEAD.

    Have fun!!