
How I found a bunch of high ranking expired domains in my niche

Discussion in 'Black Hat SEO Tools' started by catinhat, Oct 5, 2014.

  1. catinhat

    catinhat Newbie

    Joined:
    Sep 25, 2014
    Messages:
    20
    Likes Received:
    10
    Hi,

    I'm still quite new to all this, but I've been building a PBN for a few months now. One thing that caused me a lot of headaches initially was getting my hands on quality expired domains - there are some services/marketplaces around that sell them, but they are really pricey. After a while I realised that there is a fairly simple way to find them yourself. So, in the spirit of giving back to the community, I figured I would share my findings here.

    Basically, the idea is to scrape directory sites for domains, then check if those domains are still active, and finally look them up on Moz to get their domain authority. You should do more checks than just looking at the Moz metrics, but it's an OK first filter.

    Now, to get started, I went looking for directory sites within my niche (this works really well for country-specific niches too). There are lots of small directory sites around and you probably already know of a few, since you would be using them to get links to your site anyway. Large sites are fine, but what you really want is an old directory site - something where the design screams 1997 is perfect.

    Once you have identified a suitable site, you scrape it. There's probably a gazillion different ways to go about this, but the simplest I know of is using wget from the command line. It won't work with all kinds of sites, but it will work for like 90% of them. The typical problem you might run into is that it never stops, because it keeps following links to the same pages. If that happens, just let it run for a while and then abort - you can still mine the parts it managed to pull down. To mine a site with wget, run this command from a shell (replace $DOMAIN with the actual site you're targeting):

    Code:
    wget --recursive --html-extension --convert-links --domains $DOMAIN $DOMAIN
    If you only want to scrape a sub-section of the site, you can use this instead (replace /path as needed):

    Code:
    wget --recursive --html-extension --convert-links --domains $DOMAIN --no-parent $DOMAIN/path
    After wget is done, you should have an offline version of the site sitting on your computer. The next step is finding all the domains and gathering them into a list. I wrote a small script for that (see the end of this post). To use it, run this command from within the directory where you ran wget:

    Code:
    find . -type f -print0|xargs -I{} -0 cat "{}"|~/path/to/find_links.rb|iconv -f LATIN1 -t UTF8|sort|uniq > domains.txt
    You should now have a file called domains.txt that holds all unique domains, one per line.

    For the final step, I wrote another script. It simply does a DNS lookup on each domain to see if it's in use. If the DNS lookup fails, the domain is probably expired, so I look it up on Moz and write it to a new file. I also do a whois lookup and store that information while I'm at it. The command to run the script is:

    Code:
    cat domains.txt|MOZ_AUTH=xxxxxxx ~/path/to/lookup.rb > available_domains.txt
    Before running this, you need to get a Moz authorization cookie. Log in to Moz (a free account is fine), then use the browser's developer tools to find the value of the cookie named mozauth, and replace the x's above with that value. The script scrapes Moz's site for PA/DA using your account. I have throttled the script so it waits 3 seconds between lookups, just to be nice.

    Once you get this far, you can open the final file in Excel: use the import function, select tab as the separator, et voilà.

    Hope this helps someone.

    ---

    Scripts used:

    find_links.rb:
    Code:
    #!/usr/bin/env ruby
    # Pull the domain out of every href="http(s)://..." link on STDIN and print
    # it lowercased, with trailing paths and most subdomains stripped.
    STDIN.read.scan(/href="?https?:\/\/([^\/"\s>]+)/) { |mm| puts mm.first.gsub(/[?\/].*$/, '').gsub(/[#].*$/, '').gsub(/.*?([^.]{4,})(([.][^.]{2,3})+)$/, '\1\2').downcase }
    
    lookup.rb:
    I can't post the script here, because I'm too new and not allowed to put links in my post. So I put it on pastebin instead. Go to that site and put a slash and then 8v87NKu8 after the domain name. (Sorry, maybe an admin can edit this into a real link)
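
    In the meantime, here is a rough sketch of the general shape of that script - this is NOT the actual lookup.rb (grab that from pastebin), just an illustration based on the description above, with the Moz scraping part stubbed out since it depends on the mozauth cookie and Moz's page markup:

    Code:
    #!/usr/bin/env ruby
    # Sketch only - not the real lookup.rb. Reads domains from STDIN, treats any
    # domain whose DNS lookup fails as probably expired, and prints a
    # tab-separated line for each of those. The Moz PA/DA lookup is a stub
    # (moz_rank) because it needs the MOZ_AUTH cookie and Moz's page layout.
    require 'resolv'
    require 'shellwords'

    def moz_rank(domain)
      # Placeholder: fetch PA/DA from Moz here, sending ENV['MOZ_AUTH'] as the
      # mozauth cookie. Returns dummy values in this sketch.
      ["?", "?"]
    end

    STDIN.each_line do |line|
      domain = line.strip
      next if domain.empty?
      begin
        Resolv.getaddress(domain)   # resolves, so the domain is still in use
      rescue Resolv::ResolvError
        pa, da = moz_rank(domain)
        whois = `whois #{Shellwords.escape(domain)}`.lines.first.to_s.strip
        puts [domain, pa, da, whois].join("\t")   # tab-separated, Excel-friendly
        sleep 3                     # throttle between lookups, like the real script
      end
    end
    Note that this sketch shells out to the standard whois command-line tool, so you'd need that installed; the real script may handle whois differently.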
     
    • Thanks Thanks x 8
  2. samiejg

    samiejg Senior Member

    Joined:
    Dec 14, 2013
    Messages:
    825
    Likes Received:
    49
    I don't understand. Is this method supposed to help with owning the domain? Or just scraping the content of the website? Because wouldn't the content be all duplicate?
     
  3. catinhat

    catinhat Newbie

    Joined:
    Sep 25, 2014
    Messages:
    20
    Likes Received:
    10
    It's for finding expired domains that you can then register and use to build a PBN.
     
    • Thanks Thanks x 1
  4. zoewarrior

    zoewarrior Regular Member

    Joined:
    Aug 9, 2010
    Messages:
    353
    Likes Received:
    79
    Location:
    Asia
    I am not familiar with wget. How do you run it?
     
  5. Asif WILSON Khan

    Asif WILSON Khan Executive VIP Premium Member

    Joined:
    Nov 10, 2012
    Messages:
    10,119
    Likes Received:
    28,558
    Gender:
    Male
    Occupation:
    Fun Lovin' Criminal
    Location:
    London
    Home Page:
    • Thanks Thanks x 1
  6. nanavlad

    nanavlad Jr. VIP Jr. VIP Premium Member

    Joined:
    Dec 2, 2009
    Messages:
    2,420
    Likes Received:
    892
    Gender:
    Male
    Occupation:
    SEO Consultant
    Location:
    Proxy Central
  7. DX-GENERATION

    DX-GENERATION Jr. VIP Jr. VIP Premium Member

    Joined:
    Apr 14, 2010
    Messages:
    1,151
    Likes Received:
    254
    Need to try this one ! Soon !!!

    Posted via Topify on Android
     
  8. radiant13

    radiant13 Power Member

    Joined:
    May 19, 2010
    Messages:
    500
    Likes Received:
    323
    Nice share op, repped. Will try this out.
     
  9. laur.laurix

    laur.laurix Regular Member

    Joined:
    May 8, 2013
    Messages:
    411
    Likes Received:
    155
    Location:
    Mars
    Is this method more effective than xenu? I think xenu is based on wget.
     
  10. catinhat

    catinhat Newbie

    Joined:
    Sep 25, 2014
    Messages:
    20
    Likes Received:
    10
    Just to clarify - wget, like some of the other commands I showed above, is a standard unix utility. You should be able to run them from a Linux or Mac terminal. If you're on Windows, you can install cygwin and use that. The scripts are written in Ruby, so you need to have that installed too (I think it's installed by default on Mac).

    Oh - and you also need to set both scripts as executable with the command
    Code:
    chmod +x filename
    before following the guide.
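
    If you're not sure whether you have wget and Ruby available, you can quickly check from the terminal first:
    Code:
    which wget ruby
    wget --version | head -n 1
    ruby --version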
     
  11. catinhat

    catinhat Newbie

    Joined:
    Sep 25, 2014
    Messages:
    20
    Likes Received:
    10
    I'm not familiar with xenu, but it looks like a scraper that finds broken links. That's essentially what the first part of my guide does, so if xenu works for you then you can just stick with that.
     
  12. anteros

    anteros Newbie

    Joined:
    Jun 28, 2009
    Messages:
    29
    Likes Received:
    5
    Thanks for sharing, however lookup.rb is giving me errors.

    Code:
    /home/vagrant/expired.rb:76:in `lookup': Something is wrong (RuntimeError)
    from /home/vagrant/expired.rb:93:in `block in run'
    from /home/vagrant/expired.rb:88:in `each_line'
    from /home/vagrant/expired.rb:88:in `run'
    from /home/vagrant/expired.rb:113:in `<main>'