Remove duplicate domains in an Excel file full of URLs

Discussion in 'Black Hat SEO' started by Michlap, Jul 31, 2016.

  1. Michlap

    Michlap Regular Member

    Joined:
    Aug 5, 2014
    Messages:
    409
    Likes Received:
    18
    I want to remove all URLs that share a duplicate domain and keep only one URL for each domain.
     
  2. davids355

    davids355 Jr. VIP Jr. VIP

    Joined:
    Apr 25, 2011
    Messages:
    10,192
    Likes Received:
    7,844
    Home Page:
    If I understand correctly, you have a big list of URLs that could include, say, 5 URLs from the same domain, and you only want to keep a maximum of 1 URL per domain?

    I think you could do this with ScrapeBox, but possibly not with Excel.

    If you don't have ScrapeBox, then maybe you could import the list into Notepad++, use a regular expression to get the domain part from each URL, and paste that list into a second column in Excel. Then you would have a list in the form URL, domain.

    Then filter the list by unique values in the domain column.
    Then copy the URL column to a new spreadsheet. Bingo.
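
The "get the domain part from each URL" step can also be sketched in a few lines of Python instead of a regular expression. This is only a rough illustration: the helper name `domain_of` and the sample URLs are made up, and it treats the host name (minus a leading `www.`) as the domain, so subdomains such as `blog.example.org` count as separate domains.

```python
from urllib.parse import urlsplit

def domain_of(url: str) -> str:
    """Return the lowercased host of a URL, without a leading 'www.'."""
    url = url.strip()
    # urlsplit only recognises the host when a scheme or '//' is present,
    # so bare entries like 'example.com/page' get '//' prepended.
    if "//" not in url:
        url = "//" + url
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host

# Hypothetical sample list; in practice this would be the URL column.
urls = [
    "http://www.example.com/page-one",
    "https://example.com/page-two",
    "http://blog.example.org/post?id=7",
]

# Build the two-column (URL, domain) list described above.
pairs = [(u, domain_of(u)) for u in urls]
```

Each entry in `pairs` corresponds to one spreadsheet row with the URL in the first column and its domain in the second.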
     
  3. Michlap

    Michlap Regular Member

    You are correct, yes.

    Is there no easier way to do it? Some online software where I can just paste the list?
     
  4. davids355

    davids355 Jr. VIP Jr. VIP

    here you go -

    Code:
    https://urltodomain.com/
     
  5. DigitalCon

    DigitalCon Jr. VIP Jr. VIP Premium Member

    Joined:
    Jul 27, 2014
    Messages:
    519
    Likes Received:
    88
    Gender:
    Male
    Occupation:
    Internet Research
    Location:
    Home
    Home Page:
    Use the Remove Duplicates button under the Data tab in Excel. It will delete all the duplicate URLs.
     
  6. Michlap

    Michlap Regular Member

    That's not what I want, though.
     
  7. Michlap

    Michlap Regular Member

    I tried that already; it just leaves me with domains and no URLs at all.
     
  8. davids355

    davids355 Jr. VIP Jr. VIP

    To be honest, I can't work out the next bit :-/

    But if you put those URLs into that online tool, you get a list of domain names. You can paste those domain names back into Excel in a new column, and then you have a list in the form URL, DOMAIN.

    I am sure that in Excel you can filter that list to show only rows where the value in the Domain column is unique. I can't work out how to do this, but if you Google it you should find it.

    Or post in the hire-a-freelancer section for someone to do it for you using ScrapeBox, which I am sure is possible.
     
  9. davids355

    davids355 Jr. VIP Jr. VIP

    How random - I just found that I also needed to do exactly this (remove duplicates in Excel based on just one column).
    I found a guide here -

    http://www.excel-easy.com/examples/remove-duplicates.html

    Just look for this line -
    To remove rows with the same values in certain columns, execute the following steps....
     
  10. davids355

    davids355 Jr. VIP Jr. VIP

    ^^ It works - and it is what DigitalCon suggested - the Remove Duplicates button.

    Do this -

    Make sure your spreadsheet has column headings - so call your URL column URL, for example.
    Copy all the URLs and put them in that online tool to give you the domains only - they should stay in the same order with that tool.
    Make a new column in Excel with the heading Domains.
    Paste your list of domains in there and confirm the two columns match up.

    Click anywhere in the spreadsheet and then click Data > Remove Duplicates.

    Only tick the Domains box.

    That will remove all duplicates so you only have one row for each domain.

    Then copy all the URLs in your URL column.
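
For anyone who would rather script it, the same "one row per domain" step can be sketched in Python. This is a rough equivalent, not what Excel does internally: the function name `keep_first_per_domain` and the sample rows are invented for illustration, and it assumes the two columns are already paired up as (URL, domain). Like Excel's Remove Duplicates, it keeps the first row seen for each domain.

```python
def keep_first_per_domain(rows):
    """Given (url, domain) pairs, return the first pair seen for each domain."""
    seen = set()
    kept = []
    for url, domain in rows:
        if domain not in seen:
            seen.add(domain)          # remember this domain
            kept.append((url, domain))  # keep only its first row
    return kept

# Hypothetical sample rows standing in for the URL and Domains columns.
rows = [
    ("http://example.com/a", "example.com"),
    ("http://example.com/b", "example.com"),
    ("http://example.org/x", "example.org"),
    ("http://example.com/c", "example.com"),
]

# The URL column that is left after de-duplicating on the domain column.
unique_urls = [url for url, _ in keep_first_per_domain(rows)]
```

The original order of the list is preserved, so no manual clean-up of the URL column is needed afterwards.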
     
  11. Asif WILSON Khan

    Asif WILSON Khan Executive VIP Jr. VIP

    Joined:
    Nov 10, 2012
    Messages:
    12,164
    Likes Received:
    33,720
    Gender:
    Male
    Occupation:
    Fun Lovin' Criminal
    Location:
    London
    Home Page:
  12. Michlap

    Michlap Regular Member

    Thank you, but I don't see how this works. Wouldn't I then have to go through all the URLs and domains manually and delete the URLs that are not needed?
     
  13. Michlap

    Michlap Regular Member

    I've found another way: I can just sort the list from A-Z and delete as I go along. Thanks, guys!
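
The sort-and-delete approach can also be automated. The sketch below is only an illustration with made-up sample URLs and a hypothetical helper `host`; it assumes every URL includes its scheme (`http://` or `https://`). It sorts by host name rather than by the full URL, since a plain A-Z sort would separate `http://` and `https://` entries for the same site.

```python
from itertools import groupby
from urllib.parse import urlsplit

def host(url: str) -> str:
    """Host name of a URL that includes its scheme (lowercased by urlsplit)."""
    return urlsplit(url).hostname or ""

# Hypothetical sample list.
urls = [
    "http://b.com/1",
    "http://a.com/1",
    "https://b.com/2",
    "http://a.com/2",
]

# Sorting by host puts all URLs of one site next to each other.
urls_sorted = sorted(urls, key=host)

# groupby yields one group per run of equal hosts; keep the first URL of each.
one_per_domain = [next(group) for _, group in groupby(urls_sorted, key=host)]
```

Because Python's sort is stable, the URL kept for each host is the one that appeared first in the original list.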