
Compare harvested URL Lists

Discussion in 'Making Money' started by foppaGG, Feb 22, 2017.

  1. foppaGG

    foppaGG Newbie

    Joined:
    Jul 20, 2016
    Messages:
    33
    Likes Received:
    13
    Gender:
    Male
    Hey :)

    I need a tool to compare .txt files. I have a checkedblogs.txt file with blogs that I have already checked manually, and a second newblogs.txt file from ScrapeBox. Now I want to remove all entries from newblogs.txt that are already in checkedblogs.txt (a rough sketch of what I mean follows the example).

    Example:

    checkedblogs.txt
    lol.com
    omg.com
    rofl.com

    newblogs.txt
    lol.com
    omg.com
    wtf.de

    The result should be a new file which just contains wtf.de
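
    Something like the Python sketch below would probably do it (just a rough sketch, assuming one entry per line in both files; result.txt is only an example output name), but a ready-made tool would be even better:

    # Keep only the lines from newblogs.txt that do not appear in checkedblogs.txt.
    with open("checkedblogs.txt") as f:
        checked = {line.strip() for line in f if line.strip()}

    with open("newblogs.txt") as f:
        new_only = [line.strip() for line in f
                    if line.strip() and line.strip() not in checked]

    # Write the leftovers (wtf.de in the example above) to result.txt.
    with open("result.txt", "w") as f:
        f.write("\n".join(new_only) + "\n")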

    Best regards
    foppaGG
     
  2. Chicilikit

    Chicilikit Senior Member

    Joined:
    Dec 21, 2010
    Messages:
    955
    Likes Received:
    188
    Why not just put both into one file and then remove duplicates? You can do it in Excel or in ScrapeBox. Just put the data from checkedblogs in twice, so the duplicate removal strips everything from checkedblogs together with its duplicates from newblogs, and you are left with only the data you need.
     
    Last edited: Feb 22, 2017
  3. uncutu

    uncutu Elite Member

    Joined:
    Aug 6, 2010
    Messages:
    1,631
    Likes Received:
    837
    You'll need Microsoft Excel to do it. I've never found online tools with that capability.
    http://boards.straightdope.com/sdmb/showthread.php?t=621190

    He'll still be left with sites he already submitted to in the new file.
     
  4. foppaGG

    foppaGG Newbie

    Joined:
    Jul 20, 2016
    Messages:
    33
    Likes Received:
    13
    Gender:
    Male
    If I put both in one file and just remove duplicates, I get this result:

    checkedblogs.txt
    lol.com
    omg.com
    rofl.com

    newblogs.txt
    lol.com
    omg.com
    wtf.de

    result.txt
    lol.com
    omg.com
    rofl.com
    wtf.de

    but I only want

    wtf.de

    as a result, because I already checked the other blogs.
     
  5. Chicilikit

    Chicilikit Senior Member

    Joined:
    Dec 21, 2010
    Messages:
    955
    Likes Received:
    188
    You are right, with my solution Excel would need to remove every copy of a duplicate record, not just the extras. I'm not very good with Excel, but if you still have this problem tomorrow, I will make you a short uBot script. Now I'm going to bed.
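
    For reference, that "remove every copy of a duplicate" idea is only a few lines of Python, so a uBot script may not even be needed. This is just a sketch using the file names from this thread:

    from collections import Counter

    def read_lines(name):
        with open(name) as f:
            return [line.strip() for line in f if line.strip()]

    checked = read_lines("checkedblogs.txt")
    new = read_lines("newblogs.txt")

    # Stack checkedblogs twice plus newblogs once, then count each line.
    # Everything already checked now occurs at least twice, so keeping only
    # the lines that occur exactly once leaves just the new entries.
    counts = Counter(checked + checked + new)
    with open("result.txt", "w") as f:
        f.write("\n".join(line for line in new if counts[line] == 1) + "\n")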
     
  6. loopline

    loopline Jr. VIP

    Joined:
    Jan 25, 2009
    Messages:
    3,873
    Likes Received:
    2,057
    Gender:
    Male
    Home Page:
    ScrapeBox can already do this, mate. :)

    Just put your new list in the URLs Harvested section in ScrapeBox (the upper right hand quadrant grid; just go to import and load the list if it's not already there), then go to import >> import and compare on url list and select the old list that you already manually checked.

    You can also do the import and compare on the domain level.
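
    If you ever need to replicate that domain-level compare outside ScrapeBox, a rough Python sketch (not ScrapeBox's own code, just an illustration with placeholder file names) could look like this:

    from urllib.parse import urlparse

    def domain(url):
        # urlparse only finds the host when a scheme is present,
        # so add one for bare domains like "lol.com".
        if "://" not in url:
            url = "http://" + url
        host = urlparse(url).netloc.lower()
        return host[4:] if host.startswith("www.") else host

    with open("checkedblogs.txt") as f:
        checked_domains = {domain(line.strip()) for line in f if line.strip()}

    # Keep a new URL only if its domain was never seen in the checked list.
    with open("newblogs.txt") as f:
        fresh = [line.strip() for line in f
                 if line.strip() and domain(line.strip()) not in checked_domains]

    with open("result.txt", "w") as f:
        f.write("\n".join(fresh) + "\n")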
     
    • Thanks x 1