Scrapebox - 30,000,000 URLs scraped... can't remove dupes

Discussion in 'Black Hat SEO Tools' started by keith88, Jan 10, 2014.

  1. keith88

    keith88 Regular Member

    Joined:
    Sep 14, 2010
    Messages:
    287
    Likes Received:
    23
    Occupation:
    Internet Marketer
    Location:
    Home
Hey guys,

    I'm trying to clean up a massive list with SB, but when I attempt to remove duplicates it says the version only supports ANSI text files...

    How can I get around this?
     
  2. fxphil

    fxphil Senior Member

    Joined:
    Jul 16, 2010
    Messages:
    1,084
    Likes Received:
    504
Split it up, maybe into 1 mil each, then paste them right into Notepad, then copy them from Notepad into Scrapebox.

    Pasting them into Notepad turns them into the most basic txt automatically, so copying them into SB should work.
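
    If the list is one URL per line, GNU split (it ships with Cygwin, which comes up later in this thread) can do the chunking without ever opening the file in an editor. The file name and chunk prefix here are just examples:

    Code:
    # cut the master list into 1,000,000-line chunks: chunk_aa, chunk_ab, ...
    split -l 1000000 bigfile.txt chunk_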
     
  3. tahworld

    tahworld Regular Member

    Joined:
    Aug 16, 2013
    Messages:
    457
    Likes Received:
    393
    Location:
    ✔✔✔✔✔✔✔
  4. keith88

    keith88 Regular Member

    Joined:
    Sep 14, 2010
    Messages:
    287
    Likes Received:
    23
    Occupation:
    Internet Marketer
    Location:
    Home
Hmmm, I suppose there's no way around that, eh... I did plan to split the file, but only after the dupes were removed from the master list.
     
  5. LarryKowalsky

    LarryKowalsky Newbie

    Joined:
    Jan 7, 2014
    Messages:
    14
    Likes Received:
    3
    Home Page:
Use GScraper, which doesn't have a row limit when loading lists.
     
  6. zenlagor

    zenlagor Regular Member

    Joined:
    Apr 4, 2013
    Messages:
    357
    Likes Received:
    184
    Occupation:
    Virtual Pimp
    Location:
    Colombia
    Home Page:
Try TextPad or Excel.
     
  7. keith88

    keith88 Regular Member

    Joined:
    Sep 14, 2010
    Messages:
    287
    Likes Received:
    23
    Occupation:
    Internet Marketer
    Location:
    Home
I tried, but it freezes up on me...
     
  8. Asif WILSON Khan

    Asif WILSON Khan Executive VIP Premium Member

    Joined:
    Nov 10, 2012
    Messages:
    10,119
    Likes Received:
    28,558
    Gender:
    Male
    Occupation:
    Fun Lovin' Criminal
    Location:
    London
    Home Page:
  9. keith88

    keith88 Regular Member

    Joined:
    Sep 14, 2010
    Messages:
    287
    Likes Received:
    23
    Occupation:
    Internet Marketer
    Location:
    Home
    Able to handle massive files??
     
  10. Winternacht

    Winternacht Junior Member

    Joined:
    Jan 7, 2011
    Messages:
    113
    Likes Received:
    46
  11. keith88

    keith88 Regular Member

    Joined:
    Sep 14, 2010
    Messages:
    287
    Likes Received:
    23
    Occupation:
    Internet Marketer
    Location:
    Home
Yup, that's what I'm using, bro...

    That's the one that gives me the ANSI message...

    Problem is, I can't even open the file because it's so large...
     
  12. tahworld

    tahworld Regular Member

    Joined:
    Aug 16, 2013
    Messages:
    457
    Likes Received:
    393
    Location:
    ✔✔✔✔✔✔✔
    Hey Keith you can PM me the file. I'll split it for you for free.
     
  13. Robbie1

    Robbie1 Junior Member

    Joined:
    Jan 16, 2012
    Messages:
    111
    Likes Received:
    20
One of Scrapebox's free addons is an ANSI converter. Try running it through there to see if it stops the error from coming up.
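
    If the addon also struggles with a file this size, iconv (part of Cygwin, suggested further down) can do an equivalent conversion from the command line; note this is an alternative tool, not the addon itself, and the UTF-8 source encoding is an assumption that may need adjusting:

    Code:
    # convert to ANSI (Windows-1252), transliterating characters that won't map
    iconv -f UTF-8 -t WINDOWS-1252//TRANSLIT bigfile.txt > bigfile_ansi.txt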
     
  14. Javado

    Javado Junior Member

    Joined:
    Jun 26, 2012
    Messages:
    186
    Likes Received:
    78
    Occupation:
    Youtubes
    Location:
    Youtubes
You can send me the list and I will remove the dupes.
     
  15. alaltaierii

    alaltaierii Supreme Member

    Joined:
    Jun 11, 2010
    Messages:
    1,408
    Likes Received:
    349
  16. We Go

    We Go Junior Member

    Joined:
    May 8, 2012
    Messages:
    163
    Likes Received:
    54
Using Excel 2013, my experience is that duplicate removal with the built-in data tool was quick even with 200K rows, but running any macro would hit the well-known barrier at 32,767 rows — presumably because that is the maximum of VBA's 16-bit Integer type, so macros that keep row counters in an Integer overflow there (declaring them as Long avoids it).
     
  17. kinaks

    kinaks Junior Member

    Joined:
    Mar 27, 2012
    Messages:
    179
    Likes Received:
    44
    Location:
    Philippines
  18. bartosimpsonio

    bartosimpsonio Jr. VIP Jr. VIP Premium Member

    Joined:
    Mar 21, 2013
    Messages:
    8,895
    Likes Received:
    7,484
    Occupation:
    ZLinky2Buy SEO Services
    Location:
    ⇩⇩⇩⇩⇩⇩⇩⇩⇩⇩⇩⇩
    Home Page:
There's this little program called Linux which comes with every imaginable text processing tool ever devised. Then run this:

    Code:
    sort -u bigfile.txt > newfile.txt
    BTW, you can get this stuff for Windows. Search for Cygwin.
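
    For a 30-million-line file it can help to add a couple of flags; GNU sort already spills to temporary files on disk, so it won't try to hold the whole list in RAM. The buffer size and temp directory below are arbitrary examples:

    Code:
    # -u dedupes during the sort, -S caps the RAM buffer, -T sets the temp dir;
    # LC_ALL=C compares raw bytes, much faster than locale-aware sorting
    LC_ALL=C sort -u -S 2G -T /tmp bigfile.txt > newfile.txt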
     
  19. blackhat777

    blackhat777 Elite Member

    Joined:
    Jun 25, 2011
    Messages:
    1,779
    Likes Received:
    653
Sort the file alphabetically and split it into parts of 1 million lines each.
    Remove duplicates from each part by importing it into the Scrapebox harvester. Then merge the parts back into a single file and remove the duplicates again.
    It will need some manual work.
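
    For anyone comfortable with a shell, the same split-dedupe-merge plan looks roughly like this, with sort -u standing in for the Scrapebox harvester step (file names are examples; one URL per line assumed):

    Code:
    # 1. cut the master list into 1,000,000-line chunks: part_aa, part_ab, ...
    split -l 1000000 master.txt part_
    # 2. sort and dedupe each chunk in place
    for f in part_*; do sort -u "$f" -o "$f"; done
    # 3. merge the pre-sorted chunks, dropping duplicates across chunks
    sort -m -u part_* > deduped.txt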


Otherwise, if you have a database, just create a table with one column, insert everything into that table, and then remove the duplicates.
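
    A minimal sketch of that database route using the sqlite3 command-line shell (database, table, and file names are made up; the tab separator assumes no URL contains a literal tab):

    Code:
    sqlite3 dedupe.db <<'EOF'
    CREATE TABLE urls(url TEXT);
    .separator "\t"
    .import master.txt urls
    .output deduped.txt
    SELECT DISTINCT url FROM urls;
    EOF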