He is one of the people who make PAA question/answer spam sites, and here is his process.

@Sartre
Sorry if this has been asked before, but how many pages do your sites usually have?
I'm using the approach of making sites with a few hundred to a few thousand cherry-picked low-competition keywords and then hopefully turning the best performers WH, instead of the riskier method of making 200-500k lower-quality articles.
 
I have the same concern. I have my eyes on a blog that went from 0 traffic to 10+ million in less than a year with about 50k posts. I think it's crazy and risky, but the content is medium quality, not bad.
 

Wow, that's a very neat method. But how do you proceed with making it WH? Do you just delete all the posts from the site and rewrite the ones performing well? Or do you build a new site targeting those well-performing keywords?
 
Then diversify: make more blogs, direct the traffic somewhere, reinvest the money into white hat ventures. So many things to do, mate!

@RealDaddy

Yeah, exactly. I either have a rewriter make the performing articles WH, or I use the keyword on an already-WH site.
 

Damn, you are a smart-ass SEO :D
 
@Sartre I have found that if you scale content too fast/too much with mass local SEO, your crawl budget cannot keep up and you usually max out at around a 20% index rate.

I have been working on my own indexer, mixing the Reddit and Twitter methods shared by a G news site network with a Rank Math API index drip.
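For illustration, a drip-style submitter against Google's Indexing API (which, as far as I know, is the API the Rank Math instant-indexing feature is built on) might look like the minimal sketch below. This is not the poster's actual setup; the file names and the 60-second delay are assumptions.

Code:
# Drip-feed URLs to Google's Indexing API with a delay between submissions.
import time
from google.oauth2 import service_account
from google.auth.transport.requests import AuthorizedSession

SCOPES = ["https://www.googleapis.com/auth/indexing"]
ENDPOINT = "https://indexing.googleapis.com/v3/urlNotifications:publish"

# Assumed file names: a service-account key with the Indexing API enabled,
# and a plain-text list of URLs to submit.
creds = service_account.Credentials.from_service_account_file(
    "service-account.json", scopes=SCOPES)
session = AuthorizedSession(creds)

with open("urls_to_drip.txt") as f:
    urls = [line.strip() for line in f if line.strip()]

for url in urls:
    resp = session.post(ENDPOINT, json={"url": url, "type": "URL_UPDATED"})
    print(resp.status_code, url)
    time.sleep(60)  # drip slowly instead of dumping the whole list at once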

Some things I recently finished on my stack:

Automating the internal linking to pair semantic content groups.

Automating the internal link anchors to randomly refresh to similar variations and long tails on each unique page load.

Automating the generation of rich-snippet FAQ schema for every page, plus some other schema types, to stack up more SERP space.

Implemented FingerprintJS and a solution similar to Geotargetly to detect a user's geo down to their city and zip/postal code, then display hyper-localized content that matches exactly where they are searching from. G sometimes has no clue and ranks some random nearby town's "near me" page #1 for the main "near me" keyword, so all that traffic hits the wrong geo/content for them. Bad user intent. So it auto-redirects them to the right page, etc.

Automating authority outbound links by scraping the top-50 authority sites (that aren't competitors) ranking for the main keywords I am targeting, parsing out their outbound URLs, and rotating through the list, adding 2-3 per page.
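To show the rotation step of that last item in isolation (the scraping and URL parsing are left out), here is a minimal sketch; the file name, page slugs, and the 2-3-per-page rule follow the description above, everything else is assumed.

Code:
# Rotate through a pre-compiled list of authority URLs,
# attaching 2-3 of them to each generated page in round-robin order.
import itertools
import random

with open("authority_links.txt") as f:  # assumed: one outbound URL per line
    authority_links = [line.strip() for line in f if line.strip()]

link_cycle = itertools.cycle(authority_links)

def outbound_links_for_page():
    """Return the next 2-3 authority links for one page."""
    return [next(link_cycle) for _ in range(random.randint(2, 3))]

# Example: assign outbound links to a small batch of pages (slugs assumed).
pages = ["plumber-toronto", "plumber-ottawa", "plumber-calgary"]
assignments = {page: outbound_links_for_page() for page in pages}
print(assignments)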

My keyword research is done by scraping autosuggest for the main keywords, removing duplicates, running them through Ahrefs Keywords Explorer with the matching-terms filter on, applying a KGR filter, bulk-checking for fewer than 10 allintitle results on each keyword, and then starting content generation.
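For reference, KGR (Keyword Golden Ratio) is usually computed as allintitle results divided by monthly search volume, with anything under roughly 0.25 treated as a low-competition "golden" term. A minimal sketch of the filtering step described above; the CSV column names and thresholds are assumptions, not the poster's exact pipeline.

Code:
# Filter a keyword export down to KGR winners with fewer than 10 allintitle results.
import csv

def kgr(allintitle_results, monthly_volume):
    """Keyword Golden Ratio = allintitle results / monthly search volume."""
    return allintitle_results / max(monthly_volume, 1)

keepers = []
with open("ahrefs_export.csv", newline="") as f:   # assumed columns: Keyword, Volume, Allintitle
    for row in csv.DictReader(f):
        volume = int(row["Volume"])
        allintitle = int(row["Allintitle"])        # added by a separate bulk allintitle check
        if allintitle < 10 and kgr(allintitle, volume) < 0.25:
            keepers.append(row["Keyword"])

print(f"{len(keepers)} keywords pass the filter")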

I took some Python lessons for the first time recently, which helped a lot in understanding the bottlenecks in my production. I would highly recommend anyone try it if they are on the fence!

Thanks for sharing all the insights!
 
I see you know your stuff. Isn't dripping 200 articles/day enough for you?

Do you add the internal links after posting the content, using the REST API?

I wouldn't randomize anchors; I think that's unnatural and might be a red flag for Google.

Q&A/FAQ schema is great, of course.

Fantastic stuff man!!
 

It's never enough! Hah

I am starting to build national niche directories, and you can imagine the number of pages. For example, in Canada there are about 9.4k towns/cities x 200 niche keywords, with a page dedicated to every town, and they all also get French pages on /fr/ generated by G Translate. The sites end up at 2M to 3M pages.
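The page count falls straight out of the multiplication: roughly 9,400 towns times 200 keywords, doubled again by the /fr/ copies, lands in the millions. A sketch of how that URL matrix can be enumerated; the slugs and URL pattern are assumptions.

Code:
# Enumerate the town x keyword page matrix, with /fr/ duplicates.
from itertools import product

towns = ["toronto-on", "ottawa-on", "calgary-ab"]   # assumed slugs; the real list is ~9.4k
keywords = ["plumber", "electrician", "roofer"]     # assumed; the real list is ~200

def page_paths():
    for kw, town in product(keywords, towns):
        yield f"/{kw}/{town}/"       # English page
        yield f"/fr/{kw}/{town}/"    # French copy generated by machine translation

paths = list(page_paths())
print(len(paths), "pages from", len(towns), "towns x", len(keywords), "keywords")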

During content generation, I use spintax to append my internal links and outbound links and to hyper-localize the content. It pulls from different compiled link files.
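For anyone who hasn't used spintax: it's the {option one|option two|option three} syntax where one option is picked each time the content is generated. A minimal expander, assuming that simple (optionally nested) syntax; the example sentence is made up.

Code:
# Expand spintax like "Call {a|the best|your local} {plumber|electrician} today."
import random
import re

INNERMOST = re.compile(r"\{([^{}]*)\}")

def spin(text):
    """Repeatedly replace the innermost {a|b|c} group with a random choice."""
    while True:
        match = INNERMOST.search(text)
        if not match:
            return text
        choice = random.choice(match.group(1).split("|"))
        text = text[:match.start()] + choice + text[match.end():]

print(spin("Call {a|the best|your local} {plumber|electrician} in Toronto today."))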

Regarding randomizing internal link anchors: yes, there may be some footprint there, but then again G does this themselves with PAA and many other related-search widgets. There are some large mass Q&A sites that auto-rotate internal links in various ways: one menu based on geo, another based on same city/different service, and then a random "related links" block. So you are right; I only do this for a maximum of 20% of my total internal links, and I have seen decent boosts in long-tail SERPs.

I have found that the first 10 contextual internal links on a page carry 80% of the juice, in a top-down fashion. So I created a list of many different generic sentences that my priority internal anchors/links can fit into, and during content generation I slice them into sections of the content.
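A sketch of the "generic carrier sentences for priority anchors" idea, with the link sentence placed at the start of each early section so the priority links sit high on the page; the templates and data shapes here are assumptions rather than the poster's actual generator.

Code:
# Drop priority internal links into generic carrier sentences, then place one
# sentence at the start of each of the first content sections (top-down priority).
import random

TEMPLATES = [  # assumed example templates with a slot for the anchor
    'You can read more in our guide to <a href="{url}">{anchor}</a>.',
    'See <a href="{url}">{anchor}</a> for a detailed breakdown.',
    'Related: <a href="{url}">{anchor}</a>.',
]

def splice_links(sections, priority_links):
    """sections: list of paragraph strings; priority_links: list of (anchor, url) tuples."""
    out = []
    for i, section in enumerate(sections):
        if i < len(priority_links):
            anchor, url = priority_links[i]
            sentence = random.choice(TEMPLATES).format(anchor=anchor, url=url)
            out.append(sentence + " " + section)  # priority link leads the section
        else:
            out.append(section)                   # leftover sections pass through unchanged
    return out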
 
Do you use that?
IMO KGR is broscience and completely doesn't work for some keywords. We can come up with better metrics.

There are keywords with few sites competing for very lucrative clicks, like local lawyers, that would have a good KGR yet are very hard to rank for.
 
Local SEO is a genius idea; I'm getting into it now too, plus datasets.
 
Do you use that?

Yes, and then I filter terms by highest CPC and prioritize content generation from there.

The rest I leave up to G to see what sticks and, as a few others have shared on BHW, I monitor which pages/keywords pop, then revise those pages with better content, add internal links from the home page, and focus off-page SEO on them.

The most valuable tool I use every day has to be the G autosuggest scraper.

You will miss at least 30% of the freshest search data if you rely solely on Semrush and Ahrefs.
 
Interesting, I never thought about filtering by highest CPC.

How do you scrape autosuggest?

Like this:
[a-z][a-z] keyword
keyword [a-z][a-z]
?

I used to do this, but in the end I'm only using PAAs right now and finding more than enough zero-to-low-competition keywords.
 
Yes, exactly. You can go as deep as you want, but I have found 3 levels deep is plenty:
"keyword [1-9/a-z][1-9/a-z][1-9/a-z]", and it automatically does "[1-9/a-z] keyword [1-9/a-z][1-9/a-z]" through "[1-9/a-z][1-9/a-z][1-9/a-z] keyword", etc.

It's a free tool, it can run on your computer in the background during normal computer use, and it's fast as hell too.
Add 10 proxies into it and you can scrape 30k autosuggest lists in an hour.

I'm not affiliated with this software; it's been shared many times here on BHW.

Best settings for no proxy:
1 thread
300 delay
15000 timeout

Best settings for proxy support:
same threads as proxies
300 delay
10000 timeout

Max fails: 3
On no proxy: Direct Slow

Code:
https://www.epigrade.com/en/products/gsug-keyword-scraper/

I use this tool more than anything else. After the scrape is finished, you can remove duplicates, but I like to save two versions: one with all the raw data and another with duplicates removed.

Then I run a scan to count duplicates in the raw autosuggest data; you can use free tools for this:
Code:
https://www.somacon.com/p568.php

And I have found that the higher the duplication count for an autosuggest keyword, the better the keyword is to target (higher search volume), so I prioritize content generation based on that as well.
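The duplicate-count step is easy to reproduce locally instead of using the linked web tool; a sketch, assuming one keyword per line in a raw export file (the file name is made up).

Code:
# Rank raw autosuggest keywords by how often they repeat across all scraped lists.
from collections import Counter

with open("autosuggest_raw.txt", encoding="utf-8") as f:
    counts = Counter(line.strip().lower() for line in f if line.strip())

# Most-repeated suggestions first; these get prioritized for content generation.
for keyword, count in counts.most_common(50):
    print(count, keyword)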

Taking it a step further, import your autosuggest list into a cluster-grouping tool. I have developed a semantic one that works off entities and other properties G uses, but even a free keyword clusterer works fine:
Code:
https://app.contadu.com/tools/keywords-grouper


From here, you can begin segmenting your content and start defining your silo URL structure, etc.
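To show how the clusters can feed the silo structure, a tiny sketch that turns grouped keywords into silo URL paths; the slug rules and the /cluster/keyword/ layout are assumptions, not a prescription.

Code:
# Map keyword clusters onto a simple /cluster/keyword/ silo URL structure.
import re

def slugify(text):
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

clusters = {  # assumed example output from a keyword grouper
    "emergency plumber": ["emergency plumber near me", "24 hour emergency plumber"],
    "water heater": ["water heater repair", "tankless water heater installation"],
}

for cluster, keywords in clusters.items():
    for kw in keywords:
        print(f"/{slugify(cluster)}/{slugify(kw)}/")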
 

What a delightful post.

I have some responses to this, but I am half sleepy.
 
Wow, that's pretty cool. I already had autosuggest support coded into my app using the Firefox toolbar exploit, but I stopped using it since PAA scraping was so much more time/cost-efficient for finding good keywords. I might try again. Thanks.
 