
Extract domains out of text files - reward 10$ bitcoin or paypal

Discussion in 'Hire a Freelancer' started by fabssoouu, Aug 25, 2016.

  1. fabssoouu

    fabssoouu Newbie

    Joined:
    Aug 10, 2016
    Messages:
    19
    Likes Received:
    1
    Hi guys,

    Can someone please extract all the domains out of the (HTML) text files in this link?

    Exclude the following domain from the results: http://www.theperfectwedding.nl/ (and every path after the slash)

    All the files are in here (total size is 1.3 GB):

    https://drive.google.com/folderview?id=0B0ROX0vWrG2aaWxqeGZ0Z2JRMFU&usp=sharing
     
  2. seoguruseo

    seoguruseo Junior Member

    Joined:
    Aug 2, 2016
    Messages:
    199
    Likes Received:
    11
    Gender:
    Male
    Occupation:
    http://techmantraservices.com/
    Location:
    http://techmantraservices.com/
    Home Page:
    How many domains are there in total?
    Also, it's not a single file.
     
  3. Unknown Overlord

    Unknown Overlord Junior Member

    Joined:
    Nov 7, 2009
    Messages:
    106
    Likes Received:
    48
    I highly doubt someone is going to go to the trouble of working through all these files and extracting links for $10.
     
  4. fabssoouu

    fabssoouu Newbie

    Joined:
    Aug 10, 2016
    Messages:
    19
    Likes Received:
    1
    There is one domain per file, so around 10,000+ in total.
    So it is not one file. You first have to merge them and then extract, I think?
     
  5. fabssoouu

    fabssoouu Newbie

    Joined:
    Aug 10, 2016
    Messages:
    19
    Likes Received:
    1
    Let's hope somebody will :)
     
  6. seoguruseo

    seoguruseo Junior Member

    Joined:
    Aug 2, 2016
    Messages:
    199
    Likes Received:
    11
    Gender:
    Male
    Occupation:
    http://techmantraservices.com/
    Location:
    http://techmantraservices.com/
    Home Page:
    I can do it, but the question is: how are you going to verify all the domains?
     
  7. plut0

    plut0 Regular Member

    Joined:
    Aug 2, 2008
    Messages:
    262
    Likes Received:
    60
    How many TLDs do you want to grab?
    .com, .net, .org only?
     
  8. fabssoouu

    fabssoouu Newbie

    Joined:
    Aug 10, 2016
    Messages:
    19
    Likes Received:
    1
    Only one TLD must be extracted: .nl
     
  9. fabssoouu

    fabssoouu Newbie

    Joined:
    Aug 10, 2016
    Messages:
    19
    Likes Received:
    1
    In the text files, each match has to start with: href="http://www.
    And end with the TLD: .nl

    And exclude this domain: http://www.theperfectwedding.nl/

    If possible, show the file name (107, for example) and then the domain extracted from that file in an Excel doc. Only if possible.

    Hope that's enough information.
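
    A rough sketch of how the requirements above could be scripted, for anyone picking this up. It assumes a grep that supports -o (GNU and BSD grep do) and that the downloaded files sit directly in one directory. The function names and the ./pages path are made up for illustration. It writes file,domain rows as CSV, which Excel can open.

```shell
#!/usr/bin/env bash
# Sketch only. Assumes grep supports -o (GNU and BSD grep do).
# Function names and paths here are hypothetical, not from the thread.

# Print the unique www.*.nl hosts that one file links to via href="http://www.
# The excluded domain is dropped whatever path follows it.
extract_nl_domains() {
  LC_ALL=C grep -oiE 'href="http://www\.[a-z0-9.-]+\.nl[/"]' "$1" |
    tr '[:upper:]' '[:lower:]' |
    sed -E 's#^href="http://##; s#[/"]$##' |
    grep -vx 'www\.theperfectwedding\.nl' |
    sort -u
}

# Print "file,domain" rows for every regular file directly inside directory $1.
domains_to_csv() {
  echo 'file,domain'
  for f in "$1"/*; do
    [ -f "$f" ] || continue
    name=$(basename "$f")
    extract_nl_domains "$f" | while IFS= read -r domain; do
      printf '%s,%s\n' "$name" "$domain"
    done
  done
}

# Example (hypothetical directory name): domains_to_csv ./pages > domains.csv
```

    The pattern only accepts a host that is followed directly by a slash or the closing quote, so something like www.example.nl.other.com is not counted as a .nl domain. Files in subfolders are not picked up by this loop and would need to be moved into one directory first.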
     
  10. telim2

    telim2 Regular Member

    Joined:
    Sep 7, 2014
    Messages:
    339
    Likes Received:
    143
    I can automate this, but the payment is too low for the project. Add me on Skype: telim221. If you can pay $50, I will develop a bot to auto-extract the required details.
     
  11. plut0

    plut0 Regular Member

    Joined:
    Aug 2, 2008
    Messages:
    262
    Likes Received:
    60
    Should be doable. Let me play with it over the next couple of days.
     
  12. cpaforever

    cpaforever Newbie

    Joined:
    Sep 3, 2015
    Messages:
    33
    Likes Received:
    3
    Post direct links; no one is going to open all these folders for you.
    Split them into 50 MB chunks.
     
  13. amarindia

    amarindia Power Member

    Joined:
    Dec 5, 2014
    Messages:
    713
    Likes Received:
    38
    Gender:
    Male
    Occupation:
    Freelancer
    Location:
    India
    Home Page:
    I can do it... connect on Skype: bpo2india

    reg.
     
  14. living2xl

    living2xl Jr. VIP Jr. VIP

    Joined:
    Dec 9, 2011
    Messages:
    1,601
    Likes Received:
    309
    Occupation:
    Sippin dat juice - Shout it louder!
    Location:
    Not sleeping!
    Home Page:
    Damn, this is a royal pain.

    Maybe combine all the files into one txt file, then parse that file for domains and filter for .nl.
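
    The merge-then-filter idea could look roughly like the sketch below. It streams every file through one pipeline rather than writing a merged 1.3 GB file first, which is a swapped-in shortcut, not something anyone in the thread specified. The function name is hypothetical, and a grep with -o support is assumed. Unlike the stricter href="http://www. rule from the original post, this version accepts any http or https URL.

```shell
#!/usr/bin/env bash
# Sketch only. unique_nl_domains is a hypothetical helper name.
# Steps: concatenate all files under $1, pull out URL hosts, keep hosts ending
# in .nl, drop the excluded domain, and de-duplicate the result.
unique_nl_domains() {
  find "$1" -type f -exec cat {} + |
    LC_ALL=C grep -oiE 'https?://[a-z0-9.-]+' |
    tr '[:upper:]' '[:lower:]' |
    sed -E 's#^https?://##; s#[.-]+$##' |
    grep -E '\.nl$' |
    grep -vxE '(www\.)?theperfectwedding\.nl' |
    sort -u
}

# Example (hypothetical directory name): unique_nl_domains ./pages > nl-domains.txt
```

    Because find walks subfolders, this also covers files that are spread over several directories, but it loses track of which file each domain came from.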
     
  15. Asif WILSON Khan

    Asif WILSON Khan Executive VIP Jr. VIP

    Joined:
    Nov 10, 2012
    Messages:
    11,477
    Likes Received:
    32,409
    Gender:
    Male
    Occupation:
    Fun Lovin' Criminal
    Location:
    London
    Home Page:
  16. MrBlue

    MrBlue Senior Member

    Joined:
    Dec 18, 2009
    Messages:
    971
    Likes Received:
    680
    Occupation:
    Web/Bot Developer
    Not that difficult using grep and sed.

    Code:
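    # One URL per line, cut at the first space, quote or angle bracket, then filter and dedupe (assumes GNU sed for \n; IWANTthis is a placeholder).
    grep http ./yourfilename.txt | sed 's/http/\nhttp/g' | grep '^http' | sed 's/^\(http[^ "<>]*\).*/\1/' | grep IWANTthis | sort -u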
    
     
  17. satyawrat

    satyawrat Jr. VIP Jr. VIP

    Joined:
    Jul 8, 2009
    Messages:
    1,112
    Likes Received:
    1,387
    Occupation:
    Hustler
    Location:
    Gurgaon
    Home Page:
    I could do it for free if you give it to me in a single file.
     
  18. tasburrfoot

    tasburrfoot Regular Member

    Joined:
    Dec 16, 2008
    Messages:
    323
    Likes Received:
    152
    Sent you a PM.
     
  19. OrangeNRG

    OrangeNRG Regular Member

    Joined:
    Dec 10, 2012
    Messages:
    374
    Likes Received:
    244
    OP are you retarded? You should pay me $10 just for how many times I pressed the Page Down key...
     
  20. fabssoouu

    fabssoouu Newbie

    Joined:
    Aug 10, 2016
    Messages:
    19
    Likes Received:
    1
    Love u 2