Need a tool that will delete duplicate domains

hellomotow07

Power Member
Aug 24, 2010
OK, so I have been using ScrapeBox and have a big list of domains.

Are there any programs out there that will delete duplicate domains (NOT URLs), just like the "Remove duplicate domains" button in ScrapeBox? It needs to be able to import a list of text files and, of course, keep one copy of each domain (not delete it completely).
 
What's wrong with Scrapebox's function? You can import text files into it and remove duplicates, then save back to text file pretty quickly.

Or did I miss something?
 
I have over 100 million URLs, so that would take hours. Not to mention SB has a limit of 1 million, so once I have 1 million unique domains I won't be able to import any more lists to remove duplicates.
 
I can throw together something in Perl if you'd like. It would be a good exercise for me. Interested? If you and a couple of others want it, just post in this thread and I'll see if I can come up with something to post here in the next week or so...
 
That would be great. Right now I'm using Batch KeyWord Cleaner, but that only removes duplicate URLs, not domains.

/blackhat-seo/member-downloads/145457-get-batch-keyword-cleaner-great-manipulating-url-keyword-lists.html#post1319388
 
100 million URLs? Then do 1 million URLs 100 times (it should be done within an hour).
 
That would take about 2 hours of boring, tedious work (and I'll be getting even more URLs, so any program that can do this would save me lots of time), and even then they won't all be unique. I can only import up to 1 million domains into SB; after about 10-15 million URLs I'll have that many and won't be able to import any more.
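
For anyone landing on this thread later, here is a rough sketch of what such a tool could look like. It is written in Python rather than the Perl offered above, and it is not how ScrapeBox does it internally. The function names (`domain_of`, `dedupe_domains`, `main`) are made up for illustration. It assumes one URL per line, treats the host part of the URL as the "domain" (so `www.example.com` and `example.com` count as different), and keeps the first URL it sees for each domain. Files are read line by line and only the unique domains are held in memory, so the input size itself should not be the limit, though memory use will still grow with the number of unique domains.

```python
import sys
from urllib.parse import urlsplit


def domain_of(url):
    """Return the lowercased host of a URL, or None if it cannot be parsed."""
    url = url.strip()
    if not url:
        return None
    # urlsplit only fills in the host when a scheme is present,
    # so bare entries like "example.com/page" get a dummy one.
    if "://" not in url:
        url = "http://" + url
    try:
        host = urlsplit(url).hostname
    except ValueError:
        # Raised for malformed input such as a broken IPv6 literal.
        return None
    return host.lower() if host else None


def dedupe_domains(lines, seen=None):
    """Yield the first URL seen for each domain and skip the later ones."""
    if seen is None:
        seen = set()
    for line in lines:
        host = domain_of(line)
        if host is None or host in seen:
            continue
        seen.add(host)
        yield line.strip()


def main(paths, out=sys.stdout):
    # One shared set, so duplicates are also caught across different files.
    seen = set()
    for path in paths:
        with open(path, encoding="utf-8", errors="replace") as handle:
            for url in dedupe_domains(handle, seen):
                out.write(url + "\n")


if __name__ == "__main__":
    main(sys.argv[1:])
```

Saved as, say, `dedupe_domains.py` (a hypothetical file name), it would be run as `python dedupe_domains.py list1.txt list2.txt > unique.txt`. If you want `www.` and non-`www.` hosts merged, that would need an extra normalization step in `domain_of`.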
 