
Need a tool to filter a huge URL text file

Discussion in 'Black Hat SEO Tools' started by fuad_2000, May 9, 2017.

  1. fuad_2000

    fuad_2000 Junior Member

    Joined:
    Mar 13, 2010
    Messages:
    117
    Likes Received:
    10
    Hi guys,

    I have a text file containing 100,000,000 URLs and I want to filter it.
    I tried ScrapeBox for this, but it does not support a file this big.

    I need help finding a tool that can handle a huge file so I can remove the duplicate URLs from it.

    Thank you
     
  2. Botsky

    Botsky Jr. VIP Jr. VIP

    Joined:
    Oct 2, 2015
    Messages:
    266
    Likes Received:
    42
  3. fuad_2000

    fuad_2000 Junior Member

    Joined:
    Mar 13, 2010
    Messages:
    117
    Likes Received:
    10
    What do you mean, "write to me"?
     
  4. Botsky

    Botsky Jr. VIP Jr. VIP

    Joined:
    Oct 2, 2015
    Messages:
    266
    Likes Received:
    42
    Send me a PM, maybe I will write it for you.
     
  5. loopline

    loopline Jr. VIP Jr. VIP

    Joined:
    Jan 25, 2009
    Messages:
    3,875
    Likes Received:
    2,058
    Gender:
    Male
    Home Page:
    Just use the ScrapeBox dupe remove addon, and split the file into chunks of about 50 million or 25 million URLs. Then filter each chunk, and you can use the same addon to put the filtered files back together (if you want them back together). The addon is supposed to support up to about 180 million lines.
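For anyone without the addon, the split, filter, and rejoin workflow described above can be approximated in plain Python. This is only a rough sketch and not how ScrapeBox itself works: the function name `dedupe_large_file` and the `buckets` parameter are made up for illustration. Instead of splitting by line count, it splits by hash, so identical URLs always land in the same temporary file and each piece can be deduplicated on its own. It assumes one URL per line and does not preserve the original order.

```python
import os
import tempfile
import zlib


def dedupe_large_file(src_path, dst_path, buckets=64):
    """Write the unique lines of src_path to dst_path and return how many were kept.

    Hypothetical helper for illustration; output order is not preserved.
    """
    tmp_dir = tempfile.mkdtemp(prefix="dedupe_")
    paths = [os.path.join(tmp_dir, f"bucket_{i}.txt") for i in range(buckets)]
    files = [open(p, "wb") for p in paths]
    try:
        # Pass 1: identical lines always hash to the same bucket file.
        with open(src_path, "rb") as src:
            for line in src:
                url = line.strip()
                if url:  # skip blank lines
                    files[zlib.crc32(url) % buckets].write(url + b"\n")
    finally:
        for f in files:
            f.close()

    # Pass 2: each bucket holds roughly 1/buckets of the data, so an
    # in-memory set is enough to drop duplicates within it.
    kept = 0
    with open(dst_path, "wb") as dst:
        for p in paths:
            seen = set()
            with open(p, "rb") as bucket:
                for url in bucket:
                    if url not in seen:
                        seen.add(url)
                        dst.write(url)
                        kept += 1
            os.remove(p)
    os.rmdir(tmp_dir)
    return kept
```

Calling `dedupe_large_file("urls.txt", "urls_unique.txt")` (both file names are placeholders) would write the unique URLs to the second file. Raising `buckets` lowers the memory needed per bucket at the cost of more temporary files open at once.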
     
    • Thanks x 1
  6. fuad_2000

    fuad_2000 Junior Member

    Joined:
    Mar 13, 2010
    Messages:
    117
    Likes Received:
    10
    Hi loopline,
    I found it. I will test it and give you feedback.
     
    Last edited: May 10, 2017
  7. loopline

    loopline Jr. VIP Jr. VIP

    Joined:
    Jan 25, 2009
    Messages:
    3,875
    Likes Received:
    2,058
    Gender:
    Male
    Home Page:
    Sounds good.

    The main grid just works differently. I "think" the addon works with memory-mapped files, which lets it handle larger files, but of course it's not displaying the files in a GUI. The main window grid, by contrast, loads the URLs and displays them so you can work with them, which is why it cannot take files that large.
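To illustrate the memory-mapped file idea in general terms, here is a small Python sketch. It is not the addon's code, and whether the addon really uses this technique is only a guess, as noted above. The function name `iter_unique_lines` is hypothetical. The file is mapped rather than read into memory, and only an 8-byte digest of each URL is remembered, so memory use grows with the number of unique URLs rather than with the file size. With digests this short there is a very small chance that two different URLs collide and one gets dropped.

```python
import hashlib
import mmap


def iter_unique_lines(path):
    """Yield each distinct non-empty line of the file, in first-seen order.

    Hypothetical helper for illustration. An empty file cannot be mapped,
    so mmap raises ValueError in that case.
    """
    seen = set()
    with open(path, "rb") as f:
        # Length 0 maps the whole file; the OS pages data in on demand.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            for line in iter(mm.readline, b""):
                url = line.strip()
                if not url:
                    continue
                # Keep a short digest instead of the full URL to save memory.
                digest = hashlib.blake2b(url, digest_size=8).digest()
                if digest not in seen:
                    seen.add(digest)
                    yield url
```

Each yielded value is a `bytes` line without the trailing newline, so it can be written straight to an output file opened in binary mode.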
     
  8. bossofthebosses

    bossofthebosses Jr. VIP Jr. VIP

    Joined:
    Feb 7, 2015
    Messages:
    691
    Likes Received:
    282
    How do you want to filter the file? Like deleting duplicate lines? Is there any other operation you want to do with it?
     
  9. fuad_2000

    fuad_2000 Junior Member

    Joined:
    Mar 13, 2010
    Messages:
    117
    Likes Received:
    10
    Just deleting duplicate lines.
     
  10. bossofthebosses

    bossofthebosses Jr. VIP Jr. VIP

    Joined:
    Feb 7, 2015
    Messages:
    691
    Likes Received:
    282
    Then you can just do what loopline suggested: use the ScrapeBox dupe remove addon.