
[HOW TO] Clean Up & Maintain URL Lists

Discussion in 'Black Hat SEO Tools' started by natostanco, Nov 28, 2011.

  1. natostanco

    natostanco Junior Member

    Joined:
    Jul 23, 2011
    Messages:
    138
    Likes Received:
    13
    This is my way to clean up URL lists with ScrapeBox without using any addon.

    This is useful because you can do it right after blasting the list, so you don't waste much time on it; addons like Alive Checker, even though they are pretty fast, still need to run through the entire list a second time to check the URLs.

    My way to do it:

    - Blast the list
    - After the blast has run, export the error log
    - Open the error log in Notepad++
    - Press CTRL+F
    - Go to the Mark tab
    - Check the "Bookmark line" and "Match whole word only" boxes
    - Enter these words one by one and press the "Mark All" button:
    Code:
    FAILED, 0
    FAILED, 501
    FAILED, 404
    FAILED, 500
    FAILED, 301
    FAILED, 302
    FAILED, 303
    FAILED, 300
    FAILED, 503
    FAILED, 400
    FAILED, 401
    FAILED, 402
    unable to post
    comments closed
    no comment form
    Pay particular attention to "unable to post", since it is the one that will cut down the list the most.
    - Now go to the menu again: Search -> Bookmark -> Remove Bookmarked Lines
    - Wait until it finishes, then save the list.
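For anyone who would rather script it than click through Notepad++, the mark-and-remove steps above can be sketched in Python. This is only a sketch: it uses plain substring matching (a simplification of Notepad++'s "Match whole word only" option), and the file names are hypothetical.

```python
# Sketch of the mark-and-remove procedure above: drop every line of the
# exported error log that contains one of the "permanent failure" markers.
# NOTE: plain substring matching, a slight simplification of Notepad++'s
# "Match whole word only" option.

REMOVE_MARKERS = [
    "FAILED, 0", "FAILED, 501", "FAILED, 404", "FAILED, 500",
    "FAILED, 301", "FAILED, 302", "FAILED, 303", "FAILED, 300",
    "FAILED, 503", "FAILED, 400", "FAILED, 401", "FAILED, 402",
    "unable to post", "comments closed", "no comment form",
]

def clean_error_log(lines):
    """Keep only the lines that contain none of the removal markers."""
    return [line for line in lines
            if not any(marker in line for marker in REMOVE_MARKERS)]

# Usage (hypothetical file names, adjust to wherever you exported the log):
#   with open("error_log.txt") as f:
#       kept = clean_error_log(f.readlines())
#   with open("cleaned_list.txt", "w") as f:
#       f.writelines(kept)
```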

    Notes: the error entries listed above are, IMHO, the ones most likely to be permanent errors, while I would avoid removing timeouts and captchas...

    Running through all the lines to bookmark each keyword can be time-consuming.

    How do you do it? Any other methods? Does anyone think this should be included in ScrapeBox as a new addon, called "List Maintainer" or something, that would be able to:
    1- import the error log from a finished blast
    2- let the user choose which errors/lines to eliminate by checking the ones he wants from a list
    3- profit
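Purely as an illustration, a rough sketch of what such a "List Maintainer" could do. The error labels and log parsing here are assumptions based on the entries quoted above, not ScrapeBox's actual log format: first count each error type so the user can see what is in the log, then drop only the types they select.

```python
import re
from collections import Counter

# Error labels taken from the log entries quoted above; ScrapeBox's real
# log format may differ, so treat this pattern as an assumption.
ERROR_PATTERN = re.compile(
    r"FAILED, \d+|unable to post|comments closed|no comment form"
)

def summarize_errors(lines):
    """Step 1+2: count each error label so the user can pick which to drop."""
    counts = Counter()
    for line in lines:
        match = ERROR_PATTERN.search(line)
        if match:
            counts[match.group(0)] += 1
    return counts

def drop_selected(lines, selected):
    """Remove the lines whose error label the user checked."""
    return [line for line in lines
            if not any(label in line for label in selected)]
```

A front end would show the counts from `summarize_errors` as a checklist, then pass the checked labels to `drop_selected` and save the survivors.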
     
    • Thanks Thanks x 1
    Last edited: Nov 29, 2011
  2. natostanco

    natostanco Junior Member

    Joined:
    Jul 23, 2011
    Messages:
    138
    Likes Received:
    13
    Found this software that removes lines containing particular keywords:

    Code:
    http://www.sobolsoft.com/removeline/
    Does no one here update their lists? >.<
     
  3. extremephp

    extremephp BANNED

    Joined:
    Oct 19, 2010
    Messages:
    1,293
    Likes Received:
    1,272
    ScrapeBox can already export Failed, Posted, Captcha Required, and Posted URLs. Why the extra work? -_-
     
  4. natostanco

    natostanco Junior Member

    Joined:
    Jul 23, 2011
    Messages:
    138
    Likes Received:
    13
    Because among the failed ones there are still working ones, and the time this takes is really cheap.
     
  5. Thesis

    Thesis Newbie

    Joined:
    Aug 29, 2011
    Messages:
    17
    Likes Received:
    7
    I am also trying to streamline the process and have tried to make use of the error log. IMHO there aren't any error codes (XXX numbers) worth retrying, except perhaps 408, which, if I understand correctly, means the server timed out waiting for the client.
    I take the captcha list directly from ScrapeBox and run it with the slow poster. For the remainder, the only moderate success I had was to take the
    - unable to post
    - comments closed
    lists and run trackbacks against them.

    It would be interesting to know how we could use the error log to identify platforms for ScrapeBox's learn mode. So far I haven't found an easy way to group the list into any meaningful clusters.