Hey all,
So - I have been building auto-generated / AI / scraped sites for a long time now.
I train my own AI NLP models to generate text on the fly and to generate context relevant content. I scrape all day, everyday. I have tackled the hardest of niches for the longest of tails.
My sites have been shared on this forum, as well as Reddit and even a couple Russian forums (really proud of that last one).
To read more about me - check the intro post on my Ask Me Anything thread
This journey comes as a sort of challenge from a couple of fellow webmasters. We were brainstorming ideas and comparing notes on where we were in our AI / auto-generation journey - when we decided to launch brand new projects and document our journey as we go along.
New Slack channels were created instantly, and we set about it. I decided to share and document my journey on this forum as well.
The Journey
I will build 5 sites in total - they'll start from scratch. Each site will be different from the others. I'll try to lay out everything about them below.
I have built custom tools, scripts and APIs, which either do the whole job from scraping to posting, or are split up so that each handles one part of the job.
I'll be using WordPress except for Site # 4 - which is a custom script I put together once to test a prototype.
Monetization
I run several websites - both whitehat and blackhat. I have primarily been doing PPC, CPA, lead-gen and a little AdSense. This journey will finally propel me to start using Display.
Let's begin -
_________________________________________________________________________
Site # 1
Site Type - PAA (People Also Ask) only. No rewriting. (Augmented in a few places using AI)
Featured Images - No.
Domain Type - Fresh, Not Registered Before
Current Progress - See below
Indexing API - Yes
----
Site # 2
Site Type - Fine-tuned AI model generated content + a sprinkling of rewritten PAA using my custom paraphraser. Heavily augmented using AI. (Tons of unique semantically relevant AI Content added to every WordPress Post)
Featured Images - Yes, beautiful unique custom images retrieved from APIs, modified and parsed by my system and posted to the article.
Domain Type - Expired, re-registered. Currently has 20 AI generated posts, all indexed and ranking.
Current Progress - See below
Indexing API - No
----
Site # 3
Site Type - A unique twist on the PAA setup that I've had some success with. Unable to reveal more.
Featured Images - No
Domain Type - New, never registered before.
Current Progress - See below.
Indexing API - Yes
----
Site # 4
Site Type - Rewritten PAA site - all answers are paraphrased. (Does not follow the traditional WordPress format)
Featured Images - No
Domain Type - New, never registered before.
Current Progress - See below.
Indexing API - Yes
----
Site # 5
Site Type - Pure AI Generated Content. No PAA, No questions, No scraping. Just AI generated content.
Featured Images - Yes, beautiful unique custom images retrieved from APIs, modified and parsed by my system and posted to the article.
Domain Type - New, never registered before.
Current Progress - See below.
Indexing API - No
_________________________________________________________________________
Current Progress
Site # 1
- Domain Registered
- Keyword Research done
- 178K keywords extracted, sanitized
//
Site # 2
- Domain Registered (Site was already live)
- Keyword Research done
- 116K keywords extracted, sanitized, grouped into categories
//
Site # 3
- Domain Registered
- Keyword Research done
- 180K keywords extracted, sanitized, grouped into categories
//
Site # 4
- Domain Registration Pending
- Keyword Research done
- 160K keywords extracted, sanitized, grouped into categories
//
Site # 5
- Domain Registered
- Keyword Research done
- 129K keywords extracted, sanitized, grouped into categories
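
For anyone wondering what "extracted, sanitized, grouped into categories" can look like in practice, here is a minimal Python sketch. It is not my actual pipeline: the function names, the sample keywords and the category trigger words are all made up for illustration, and real grouping at 100K+ keywords would likely need something smarter than whole-word matching.

```python
import re
from collections import defaultdict


def sanitize(keywords):
    """Lowercase, strip stray punctuation, collapse whitespace, drop blanks and duplicates."""
    seen = set()
    cleaned = []
    for raw in keywords:
        # Replace anything that is not a word character, space, apostrophe or hyphen.
        kw = re.sub(r"[^\w\s'-]", " ", raw.lower())
        kw = re.sub(r"\s+", " ", kw).strip()
        if kw and kw not in seen:
            seen.add(kw)
            cleaned.append(kw)
    return cleaned


def group_by_category(keywords, categories):
    """Put each keyword in the first category whose trigger word appears as a whole word."""
    groups = defaultdict(list)
    for kw in keywords:
        words = set(kw.split())
        for name, triggers in categories.items():
            if words & triggers:
                groups[name].append(kw)
                break
        else:
            # No trigger matched; keep the keyword so nothing is silently lost.
            groups["uncategorized"].append(kw)
    return dict(groups)


# Hypothetical sample data, not from any of the five sites.
raw_keywords = [
    "  How to Clean a Keyboard? ",
    "how to clean a keyboard",
    "best   mechanical keyboard!!",
    "",
    "Fix laptop fan noise",
]
categories = {"keyboards": {"keyboard"}, "laptops": {"laptop"}}

clean_keywords = sanitize(raw_keywords)
grouped = group_by_category(clean_keywords, categories)
```

Sanitizing before grouping keeps near-duplicates (case, spacing, trailing punctuation) from inflating the keyword count and from ending up as separate posts.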
_________________________________________________________________________
People also ask
1. How are you generating content?
I have my own fine-tuned models (both GPT3 and Non-GPT3). I am always looking at datasets and training models.
2. What do you use for Paraphrasing?
This depends on my end needs, but I end up using T5 a lot - obviously with a lot of custom-trained models. If you're looking to get started with paraphrasing - read my post here - https://huggingface.co/ramsrigouthamg/t5-large-paraphraser-diverse-high-quality
Beyond this, @Cognitive has a lovely post on extracting semantics of an article and then using AI to further enhance it. I have something similar that I work with, not to his scale - but it does the job well.
3. Do you connect GSC/GA to this site?
Yes, all my sites have GSC, albeit with different accounts. I often alternate between GA and Matomo.
4. What language are your scripts in?
Python mostly. Some node.js.
5. Where are your scripts hosted?
I have a couple massive GPU setups in the office to be able to run text-transformers. I also have a few servers at Vultr and AWS.
6. Will you sell your script/setup?
Nope.
7. What about backlinks?
I build social signals (automated) and Web 2.0 - when launching a new site. That's the extent of it for my BH sites.
8. What niche are you working on?
Prefer not to talk about that.
9. I have more questions
Respond here, I'll answer where I can.
_________________________________________________________________________
Also
Three incredible people doing stuff with automation are @Sartre, @spectrejoe and @Preon - you should definitely follow their journey and many of your questions will be answered.
Follow @Sartre's journey here - https://www.blackhatworld.com/seo/j...sing-ai-generated-content-lets-do-it.1360940/
Follow @spectrejoe's journey here - https://www.blackhatworld.com/seo/s...-content-to-700-month-self-coded-bot.1323860/
Follow @Preon's journey here - https://www.blackhatworld.com/seo/s...ontent-to-100-000-page-views-a-month.1313311/
_________________________________________________________________________
Updates
I'll be updating this thread once or twice a week. I'll be answering questions more frequently though.
These are not the only projects I'll be working on. The core idea behind automation is scaling. So I need to keep building new sites plus augmenting my existing ones.
_________________________________________________________________________
Let's fucking go!