Discussion in 'Black Hat SEO' started by xovian, Sep 15, 2012.
Problem solved. Please close and remove the thread.
I could do that for you; let me know what you are offering.
"IF" file 1 that has 5 million - if those lines are urls, scrapebox will do it, assuming you have scrapebox or buy it for the $57.
If the 5 million rows are just random lines of data, you can use my ScrapeBox Classroom URL analyzer tool. It's free. It won't load 5 million lines at once, but you could break the file up into roughly 500K-line chunks and try it; it also depends on your available resources.
I had this up on a nice, easy-to-use site, but the site has a DB error at the moment and I haven't had time to fix it yet.
The first vid explains the concept; the second explains editing the filters. You would just have to make a filter containing file 2 with your couple thousand keywords, and it would need to be in regex format. Then you would wind up with the results you wanted. Mind you, it's going to be SLOW with a few thousand keywords in there, and you might need to go in chunks smaller than 500K. But it's a line-by-line filter, so it will do what you want.
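If you'd rather script the same idea yourself, here's a minimal Python sketch (my tool is Python anyway): compile the keywords from file 2 into one alternation regex and filter file 1 line by line. The file names here are placeholders, not the tool's actual inputs.

Code:
import re

KEYWORDS_FILE = "file2_keywords.txt"  # hypothetical name: a few thousand keywords, one per line
DATA_FILE = "file1_data.txt"          # hypothetical name: the 5 million line file
OUTPUT_FILE = "matched_lines.txt"

# Build one alternation regex from the keyword list.
# re.escape stops characters like . and ? being read as regex syntax.
with open(KEYWORDS_FILE, encoding="utf-8") as f:
    keywords = [re.escape(k.strip()) for k in f if k.strip()]
pattern = re.compile("|".join(keywords), re.IGNORECASE)

# Stream line by line so the whole 5M-line file never sits in memory.
with open(DATA_FILE, encoding="utf-8") as src, \
     open(OUTPUT_FILE, "w", encoding="utf-8") as out:
    for line in src:
        if pattern.search(line):
            out.write(line)

A few thousand alternations in one pattern is exactly why it runs slow, as said above; chunking the input just bounds how long each run takes.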
It's over 32MB, so VT won't scan it.
As for splitting it, you can use the free ScrapeBox dupe remove addon. The file splitter, merger, and duplicate "URL" removal all work line by line, so they aren't limited to URLs; they operate on any data in a given line.
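If you want to skip the addon, the splitting itself is a short Python job too. A sketch, assuming ~500K lines per chunk and the same placeholder file name as above:

Code:
CHUNK_SIZE = 500_000              # assumed chunk size from the post above
DATA_FILE = "file1_data.txt"      # hypothetical name

with open(DATA_FILE, encoding="utf-8") as src:
    chunk, part = [], 0
    for i, line in enumerate(src, start=1):
        chunk.append(line)
        if i % CHUNK_SIZE == 0:   # flush a full chunk to its own file
            part += 1
            with open(f"chunk_{part:03d}.txt", "w", encoding="utf-8") as out:
                out.writelines(chunk)
            chunk = []
    if chunk:                     # write whatever is left over
        part += 1
        with open(f"chunk_{part:03d}.txt", "w", encoding="utf-8") as out:
            out.writelines(chunk)

Because it treats each line as an opaque string, it splits the same whether the lines are URLs or any other data, which is the same point made about the addon.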
All that said, if godmonkee can make something for you, that's probably going to be more robust. My tool was written in Python, and it doesn't seem to handle large data sets very robustly. Something in C#, Delphi, or another compiled language would be more ideal. But then I'm no expert on programming either, so whatever the dev thinks is probably best.