
Which tool do you recommend for removing duplicates from a 20-million-URL .txt file?

Discussion in 'Black Hat SEO Tools' started by bk071, Apr 3, 2011.

  1. bk071

    So which one do you guys use?
    Surely ScrapeBox will crash on lists of that size.
    And I've tried the DupeRemove addon. It does the job fine, but it takes forever to write the unique URLs out to the target .txt file.

    Any suggestions?
    Bk...
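
    For reference, here is a minimal brute-force version in Python (a sketch, not something posted in the thread): it streams the file once and keeps every URL it has seen in a set, so it needs enough RAM to hold all of the unique URLs at once, which for 20 million of them is roughly a few GB. The file names are placeholders.

    Code:
    # Streaming dedupe with an in-memory set; assumes one URL per line.
    # "urls.txt" and "urls_unique.txt" are placeholder file names.
    seen = set()
    with open("urls.txt", encoding="utf-8", errors="ignore") as src, \
         open("urls_unique.txt", "w", encoding="utf-8") as dst:
        for line in src:
            url = line.strip()
            if url and url not in seen:  # keep only the first sighting
                seen.add(url)
                dst.write(url + "\n")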
     
  2. m_zec

    I use the DupeRemove addon and it does a good job... If it's taking too long, try splitting the list into two lists of 10 million URLs each and running them through DupeRemove one at a time. After that, put the results back together and run the combined list through DupeRemove once more... More work, but it clears the duplicates faster. (The same split-then-recombine idea can also be scripted; see the sketch after the link below.)

    Or you can try this tool from loopline:

    Code:
    http://www.blackhatworld.com/blackhat-seo/black-hat-seo-tools/279487-scrapebox-helper-tools-free.html
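
    That split-then-recombine idea can be scripted as an external merge so memory stays bounded (a sketch, not from the thread; the chunk size, file names, and helper name are all assumptions): sort each chunk in RAM, spill it to a temp file, then merge the sorted runs while skipping repeats.

    Code:
    # External dedupe: sort chunks in RAM, spill them to disk, then
    # k-way merge the sorted runs, dropping consecutive duplicates.
    import heapq
    import itertools
    import tempfile

    CHUNK_LINES = 2_000_000  # lines per in-memory chunk; tune to your RAM

    def dedupe_external(src_path, dst_path):
        runs = []  # one temp file per sorted, locally deduped chunk
        with open(src_path, encoding="utf-8", errors="ignore") as src:
            while True:
                chunk = list(itertools.islice(src, CHUNK_LINES))
                if not chunk:
                    break
                urls = sorted({line.strip() for line in chunk if line.strip()})
                run = tempfile.TemporaryFile("w+", encoding="utf-8")
                run.writelines(u + "\n" for u in urls)
                run.seek(0)
                runs.append(run)
        with open(dst_path, "w", encoding="utf-8") as dst:
            merged = heapq.merge(*runs)               # lines arrive globally sorted
            for url, _ in itertools.groupby(merged):  # equal lines are adjacent
                dst.write(url)
        for run in runs:
            run.close()

    # e.g. dedupe_external("urls.txt", "urls_unique.txt")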
     