I know there are multiple sites that will list all the domains that have recently expired, but my question is, where do these websites source this data from?
I had the same question some time ago. So I went down the domain hunting rabbit hole and trust me, there's so much work put into acquiring this data that it would take several threads to explain it properly.
In brief, these are the most common techniques:
Zonefile Hunting
Every top-level domain (.com, .net, .xyz, etc.) is maintained by a company, the registry. One of this company's responsibilities is to keep what is called a "zonefile". The zonefile lists all domain names under that TLD that are online (that is, have nameservers set) on a particular day, and it is updated every 24 hours.
So, if you have the .COM zonefile (which is maintained by Verisign), you can get all the domains that have a site associated with them today.
By comparing these zonefiles across several days, you can get a good picture of which domain names a TLD has and which ones have dropped out. .COM is the largest. I have access to that zonefile, and as of today the file is about 23 GB and has roughly 200 million domains. You can then pull the whois records for these domains to see when each registration expires, and check again after the expiration date to see whether the domain was renewed.
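To make the day-to-day comparison concrete, here is a minimal sketch in Python. It assumes a simplified zonefile layout where each line reads "owner TTL class type data" and only NS records matter. Real zonefiles differ per registry, and the function names here are made up for illustration.

```python
def domains_in_zone(lines, tld="com"):
    """Collect second-level names under `tld` that have at least one NS record."""
    suffix = "." + tld
    found = set()
    for line in lines:
        parts = line.split()
        # Assumed layout: owner, ttl, class, type, rdata
        if len(parts) < 5 or parts[3].upper() != "NS":
            continue
        owner = parts[0].rstrip(".").lower()
        # Keep only names directly under the TLD, e.g. example.com
        if owner.endswith(suffix) and owner.count(".") == 1:
            found.add(owner)
    return found


def dropped_between(old_lines, new_lines, tld="com"):
    """Names present in the older snapshot but missing from the newer one."""
    return sorted(domains_in_zone(old_lines, tld) - domains_in_zone(new_lines, tld))


yesterday = [
    "example.com. 172800 IN NS ns1.example.net.",
    "gone.com. 172800 IN NS ns1.host.net.",
]
today = ["example.com. 172800 IN NS ns1.example.net."]
print(dropped_between(yesterday, today))  # ['gone.com']
```

A name that drops out of the zonefile is only a candidate. It may simply have had its nameservers removed, so the whois check is still needed. For a file the size of .COM you would probably sort and stream the two snapshots rather than hold both sets in memory.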
Backlink Profile Scanning
SEO services like Ahrefs and Majestic run crawlers that collect backlinks. By checking whether the domains in the outgoing links of popular sites still resolve in DNS, you can see which ones have expired.
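The DNS check itself can be as simple as the following sketch. It uses Python's standard socket.getaddrinfo. The helper names are hypothetical, and the lookup function is passed in as a parameter only so that it can be swapped out for testing.

```python
import socket


def resolves(domain, lookup=socket.getaddrinfo):
    """Return True if the name resolves to at least one address."""
    try:
        lookup(domain, None)
        return True
    except socket.gaierror:
        # Raised for names that do not resolve (and for some temporary failures)
        return False


def unresolved(domains, lookup=socket.getaddrinfo):
    """Filter a list of domains down to the ones that fail to resolve."""
    return [d for d in domains if not resolves(d, lookup)]
```

Calling unresolved() on the domains pulled from a backlink export gives you a shortlist. A failed lookup is not proof that the domain has expired, so treat the result as candidates to verify against whois.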
Private Crawlers
Same as above, but instead of using SEO backlink data, you set up your own crawlers to analyse large websites. This is usually done only when you want domains that have backlinks from certain sites.
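As a rough idea of what the crawler does with each page, here is a sketch that pulls the external hostnames out of a page's links. It uses only the standard library. The class and function names are made up for this example, and fetching the pages is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkHosts(HTMLParser):
    """Collect hostnames of <a href> links that point away from own_host."""

    def __init__(self, own_host):
        super().__init__()
        self.own_host = own_host
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        host = urlparse(href).hostname  # None for relative links
        if host and host != self.own_host:
            self.hosts.add(host)


def external_hosts(html, own_host):
    parser = LinkHosts(own_host)
    parser.feed(html)
    return sorted(parser.hosts)


page = (
    '<a href="http://old-site.com/x">old</a>'
    '<a href="/about">about</a>'
    '<a href="https://big.example/y">self</a>'
)
print(external_hosts(page, "big.example"))
```

The hostnames collected this way are then checked for expiry the same way as the backlink data.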
SERP Hunting
There will be SERPs (often of forums or message boards) mentioning old, now expired websites. By scraping SERPs for these mentions, you can find some expired domains.
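Once you have the scraped text, pulling out the mentions is mostly pattern matching. The sketch below uses a deliberately loose regular expression of my own, not any standard one. It will also pick up things like file names, so the results need filtering before you check them.

```python
import re

# Loose pattern: one or more dot-separated labels followed by an alphabetic TLD
DOMAIN_RE = re.compile(
    r"\b(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,}\b",
    re.IGNORECASE,
)


def mentioned_domains(text):
    """Return the unique, lowercased domain-like strings found in text."""
    return sorted({m.group(0).lower() for m in DOMAIN_RE.finditer(text)})


snippet = "Anyone remember OldForum.net? It moved from http://www.coolsite.com/page years ago."
print(mentioned_domains(snippet))
```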
Social Media Mentions
Just like the above, but instead of searching in SERPs, you search social media sites for linked websites and filter for 404s.
Auction and Registrar Lists
Several registrars auction domain names as soon as they reach the pending delete status. You can get these lists from their APIs and similar feeds.
Insider Info
Finally, there's insider info that only registrars and big-name domainers inside the industry know.