[Journey] 5 x AI Sites | 5 Different Approaches | 1 Million Cumulative UVs/Month

BlogPro

Hey all,

So - I have been building auto-generated / AI / scraped sites for a long time now.

I train my own AI NLP models to generate text on the fly and to generate context-relevant content. I scrape all day, every day. I have tackled the hardest of niches for the longest of tails.

My sites have been shared on this forum, as well as Reddit and even a couple Russian forums (really proud of that last one).

To read more about me - check the intro post on my Ask Me Anything thread

This journey comes as a sort of challenge from a couple of fellow webmasters. We were brainstorming ideas and comparing notes on where we were in our AI / auto-generation journeys when we decided to launch brand new projects and document them as we go along.

New slack channels were created instantly. And we set about. I decided to share and document my journey on this forum as well.

The Journey

I will build 5 sites in total - they'll start from scratch. Each site will be different from the others. I'll try to lay out everything about them below.

I have built custom tools, scripts and APIs, which either do the whole job from scraping to posting, or are split up so that each handles one part of the job.

I'll be using WordPress except for Site # 4 - which is a custom script I put together once to test a prototype.

Monetization

I run several websites - both whitehat and blackhat. I have primarily been doing PPC, CPA, lead-gen and a little AdSense. This journey will finally propel me to start using display ads.

Let's begin -

_________________________________________________________________________

Site # 1

Site Type -
PAA Only. No rewriting. (Augmented in a few places using AI)

Featured Images - No.

Domain Type - Fresh, Not Registered Before

Current Progress - See below

Indexing API - Yes

----

Site # 2

Site Type -
Fine-tuned AI model generated content + a sprinkling of rewritten PAA using my custom paraphraser. Heavily augmented using AI. (Tons of unique semantically relevant AI Content added to every WordPress Post)

Featured Images - Yes, beautiful unique custom images retrieved from APIs, modified and parsed by my system and posted to the article.

Domain Type - Expired, re-registered. Currently has 20 AI generated posts, all indexed and ranking.

Current Progress - See below

Indexing API - No

----

Site # 3

Site Type -
A unique twist on PAA setup, that I've had some success with. Unable to reveal more.

Featured Images - No

Domain Type - New, never registered before.

Current Progress - See below.

Indexing API - Yes

----

Site # 4

Site Type -
Rewritten PAA site - all answers are paraphrased. (Does not follow the traditional WordPress format)

Featured Images - No

Domain Type - New, never registered before.

Current Progress - See below.

Indexing API - Yes

----

Site # 5

Site Type -
Pure AI Generated Content. No PAA, No questions, No scraping. Just AI generated content.

Featured Images - Yes, beautiful unique custom images retrieved from APIs, modified and parsed by my system and posted to the article.

Domain Type - New, never registered before.

Current Progress - See below.

Indexing API - No

_________________________________________________________________________

Current Progress

Site # 1


- Domain Registered
- Keyword Research done
- 178K keywords extracted, sanitized

//

Site # 2

- Domain Registered (Site was already live)
- Keyword Research done
- 116K keywords extracted, sanitized, grouped into categories

//

Site # 3

- Domain Registered
- Keyword Research done
- 180K keywords extracted, sanitized, grouped into categories

//

Site # 4

- Domain Registration Pending
- Keyword Research done
- 160K keywords extracted, sanitized, grouped into categories

//

Site # 5

- Domain Registered
- Keyword Research done
- 129K keywords extracted, sanitized, grouped into categories
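
The "extracted, sanitized, grouped into categories" step is not described in the thread, so the following is only a minimal sketch of what such a pass could look like. The sample keywords, the category names and the seed-word matching rule are all hypothetical placeholders, not the author's actual pipeline.

```python
import re
from collections import defaultdict


def sanitize(keywords):
    """Lowercase, collapse whitespace, drop empties and duplicates (order kept)."""
    seen = set()
    cleaned = []
    for kw in keywords:
        kw = re.sub(r"\s+", " ", kw).strip().lower()
        if kw and kw not in seen:
            seen.add(kw)
            cleaned.append(kw)
    return cleaned


def group_by_category(keywords, categories):
    """Put each keyword in the first category that shares a seed word with it."""
    groups = defaultdict(list)
    for kw in keywords:
        words = set(kw.split())
        for name, seeds in categories.items():
            if words & set(seeds):
                groups[name].append(kw)
                break
        else:
            # No seed word matched: keep the keyword for manual review.
            groups["uncategorized"].append(kw)
    return dict(groups)


# Hypothetical sample data, for illustration only.
raw = ["  Best  Running Shoes ", "best running shoes", "how to clean shoes", "", "trail map app"]
categories = {"shoes": ["shoes", "sneakers"], "apps": ["app", "apps"]}
grouped = group_by_category(sanitize(raw), categories)
print(grouped)
```

At 100K+ keywords a simple seed-word rule like this would likely be replaced by clustering, but the sanitize-then-group shape stays the same.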

_________________________________________________________________________

People also ask

1. How are you generating content?

I have my own fine-tuned models (both GPT3 and Non-GPT3). I am always looking at datasets and training models.

2. What do you use for Paraphrasing?

This is more dependent on my end needs. But I end up using T5 a lot - obviously with a lot of custom trained models. If you're looking to get started with paraphrasing - read my post here - https://huggingface.co/ramsrigouthamg/t5-large-paraphraser-diverse-high-quality

Beyond this, @Cognitive has a lovely post on extracting semantics of an article and then using AI to further enhance it. I have something similar that I work with, not to his scale - but it does the job well.

3. Do you connect GSC/GA to this site?

Yes, all my sites have GSC, albeit with different accounts. I often alternate between GA and Matomo.

4. What language are your scripts in?

Python mostly. Some node.js.

5. Where are your scripts hosted?

I have a couple massive GPU setups in the office to be able to run text-transformers. I also have a few servers at Vultr and AWS.

6. Will you sell your script/setup?

Nope.

7. What about backlinks?

I build social signals (automated) and Web 2.0 - when launching a new site. That's the extent of it for my BH sites.

8. What niche are you working on?

Prefer not to talk about that.

9. I have more questions

Respond here, I'll answer where I can.

_________________________________________________________________________

Also

Three incredible people doing stuff with automation are @Sartre, @spectrejoe and @Preon - you should definitely follow their journey and many of your questions will be answered.

Follow @Sartre's journey here - https://www.blackhatworld.com/seo/j...sing-ai-generated-content-lets-do-it.1360940/

Follow @spectrejoe's journey here - https://www.blackhatworld.com/seo/s...-content-to-700-month-self-coded-bot.1323860/

Follow @Preon's journey here - https://www.blackhatworld.com/seo/s...ontent-to-100-000-page-views-a-month.1313311/

_________________________________________________________________________

Updates

I'll be updating this thread once or twice a week. I'll be answering questions more frequently though.

These are not the only projects I'll be working on. The core idea behind automation is scaling. So I need to keep building new sites plus augmenting my existing ones.

_________________________________________________________________________

Let's fucking go!
 
I will be following your journey. I started a similar one myself with AI content, but I'm doing all the work manually.

How do you index the links?

Good luck on your journey
 
This will be a great one to follow, good luck mate.
 
I have a couple massive GPU setups in the office to be able to run text-transformers. I also have a few servers at Vultr and AWS.
What GPUs do you use for your models, and how many? How large are the models? Can they fit on something like 2x 2080 Ti with about 24 GB of VRAM? What do you require for inference?

How long did training them take? Really interested to know about your ml infrastructure.
 
This is insane.
If you can, please launch a paid course or something to teach this (if you want).
Can't really get my eyes off people making $XXX.XXX per month doing AI sites.
Hey, you can learn AI on your own. There are a lot of resources and books available. Even if he sells an e-book, it won't be that easy to understand the whole concept as you imagine.
 
Very interesting journey! Best of luck with it! I'm certainly following it.
5. Where are your scripts hosted?

I have a couple massive GPU setups in the office to be able to run text-transformers.
Just out of curiosity: what GPUs do you use? Have you found some new ones at decent prices? Asking because, with the current crypto crash, the price of some GPUs is quite affordable compared to what it was 1-2 years ago.
Last week I bought a second-hand RTX3090 for $1200 (it was used for mining - not in excellent condition now, but it does the job) and added it to our "fleet" of office GPUs that we use for our tests. The same second-hand GPU was selling for $3000-$3500 last year. I know the RTX3090 is not ideal for ML as it is a gaming GPU, but it's still a beefy one :). The A100 still costs a fortune, LOL

Also, wanted to ask you what is the keyword research strategy that you use for these sites? Thanks
 
why did you buy that GPU? which model do you train?
 
Not training anything on the RTX3090... just running some pre-trained / fine-tuned AI models on it (e.g. GPT-J 6B). It's too slow for training medium-size AI models like GPT-J. Actually, if someone wanted to test fine-tuning this model with 8-bit weights, I don't think it would be that slow... but I am not convinced the quality is the same with this version.

https://huggingface.co/hivemind/gpt-j-6B-8bit
 
How is the speed of execution? I tried it using CPU and it was very slow and also the output was not as expected.
 
I will be following your journey, started one similar myself with A.I content but doing it all manual work.

How do you index the links?

Good luck on your journey

For WordPress it's relatively easy - I use Rank Math's Instant Indexing.

For non-WordPress, I built a small script that scrapes new URLs and submits them to Google every 48 hours. It runs as a cron job.
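
The script itself is not shown in the thread. As a rough sketch of the "submit only new URLs" part, assuming the Google Indexing API's `urlNotifications:publish` endpoint is the target: the state file, the function names and the injected `send` callable are hypothetical, and authentication (an OAuth service account) is left out entirely.

```python
import json
from pathlib import Path

# Google Indexing API publish endpoint; requests must carry OAuth credentials,
# which this sketch does not handle.
INDEXING_ENDPOINT = "https://indexing.googleapis.com/v3/urlNotifications:publish"


def load_seen(path):
    """Load the set of URLs submitted on earlier runs (empty if no state file yet)."""
    p = Path(path)
    return set(json.loads(p.read_text())) if p.exists() else set()


def save_seen(path, seen):
    """Persist the submitted URLs so the next cron run skips them."""
    Path(path).write_text(json.dumps(sorted(seen)))


def build_notification(url):
    """Request body telling Google the URL is new or updated."""
    return {"url": url, "type": "URL_UPDATED"}


def submit_new_urls(current_urls, seen, send):
    """Call send(endpoint, payload) for every URL not submitted before.

    `send` is any callable that performs the authenticated POST and
    returns True on success; it is injected so the logic can be tested offline.
    """
    submitted = []
    for url in current_urls:
        if url in seen:
            continue
        if send(INDEXING_ENDPOINT, build_notification(url)):
            seen.add(url)
            submitted.append(url)
    return submitted
```

In a real run, `current_urls` might come from the site's sitemap and `send` would wrap an authenticated HTTP client; the script would call `load_seen`, `submit_new_urls` and `save_seen` once per cron invocation.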

This is insane.
If you can, please launch a paid course or something to teach this (if you want).
Can't really get my eyes off people making $XXX.XXX per month doing AI sites.

I am not a huge fan of the Guru model where I issue a one-time paid course. I might do something with a follow-along or a setup service later.

Cheers man.

This will be an awesome journey. I hope you reach your goals soon. :)

I would like to know more about how you find images and post them on your sites. I've hit a roadblock with the image insertion part.

I use the API from Unsplash, Pexels and a few others and use Python Libraries to process them and turn them into featured images.
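
The processing step is not detailed here. One common piece of it is cropping a stock photo to a fixed featured-image aspect ratio; a minimal sketch of that calculation follows. The 1200x630 target and the function name are assumptions, not something stated in the thread.

```python
def center_crop_box(width, height, target_w=1200, target_h=630):
    """Return (left, top, right, bottom) of the largest centered box
    that has the target aspect ratio. Integer math avoids rounding drift."""
    if width * target_h > height * target_w:
        # Source is wider than the target ratio: keep full height, trim the sides.
        new_w = height * target_w // target_h
        left = (width - new_w) // 2
        return (left, 0, left + new_w, height)
    # Source is taller than (or equal to) the target ratio: keep full width.
    new_h = width * target_h // target_w
    top = (height - new_h) // 2
    return (0, top, width, top + new_h)


# With Pillow installed (an assumption), the box could be applied like this:
#   from PIL import Image
#   img = Image.open("photo.jpg")
#   box = center_crop_box(*img.size)
#   img.crop(box).resize((1200, 630)).save("featured.jpg")
print(center_crop_box(4000, 3000))
```

Computing the box separately from the image library keeps the cropping rule easy to test and reuse across image sources.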

What and how many GPUs do you use for your models? How large are the models. Can they fit on like a 2x 2080ti with about 24 gigs of VRAM? What do you require for inference?

How long did training them take? Really interested to know about your ml infrastructure.

My local machines are exclusively for inference. I actually have one with a 2080ti setup and another 2 x RTX3090.

For training, I use Amazon SageMaker. It saves every trained model to S3, and if you know what you're doing, you can download the trained models to use in your local configuration (on a local system or on a different/cheaper VM).

Very interesting journey! Best of luck with it! I'm certainly following it.

Just out of curiosity: what GPUs do you use? Have you found some new ones at decent prices? Asking because, with the current crypto crash, the price of some GPUs is quite affordable compared to what it was 1-2 years ago.
Last week I bought a second-hand RTX3090 for $1200 (it was used for mining - not in excellent condition now, but it does the job) and added it to our "fleet" of office GPUs that we use for our tests. The same second-hand GPU was selling for $3000-$3500 last year. I know the RTX3090 is not ideal for ML as it is a gaming GPU, but it's still a beefy one :). The A100 still costs a fortune, LOL

Also, wanted to ask you what is the keyword research strategy that you use for these sites? Thanks

I answered above. 1x2080ti and 2x3090.

Like yours, my 3090s were recent acquisitions, and they were bought used too. As long as they get the job done, right?

And thank god the prices are lower now.

Even so, I am moving a lot of work to MLaaS for now. It lets me follow the pay-as-you-go (PAYG) model, and running tests, prototypes etc. is simple enough.

I have been eyeing the A100 too. But honestly, it'd be overkill for me right now.

//

Since this is an "output oriented" project (I generate, publish and push out content into the wild) rather than a "process oriented" project, where a large model is trained to perform a task or generate outputs in a closed environment, I think my requirements are satisfied by what I have right now.

The biggest model I am currently working on/with is T0pp - it's a pretty neat transformer model, and at 40+ GB it is huge. Trying to see what I can make it do.
 
How is the speed of execution? I tried it using CPU and it was very slow and also the output was not as expected.
Don't know ...never tried the 8-bit version on CPU. But I guess it will still take a lot of time to generate something on CPU. But if you decide to run it on GPU, an 8GB one (like GTX1080) should work fine. 1080Ti should be enough for fine-tuning this version.
 
How is the speed of execution? I tried it using CPU and it was very slow and also the output was not as expected.
I don't think it will affect speed (especially on CPU); 8-bit only affects memory usage, IIRC.
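
For a rough feel of the memory side of this discussion, weight size can be estimated as parameter count times bytes per parameter. This is a back-of-the-envelope sketch only: the parameter counts (about 6.05B for GPT-J 6B, about 11B for T0pp) are approximate public figures, and the estimate ignores activations, caches and optimizer state, which add to the real footprint.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gb(n_params, precision):
    """Approximate size of the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * BYTES_PER_PARAM[precision] / 1e9


# Approximate parameter counts (assumptions, see lead-in).
for name, n_params in [("GPT-J 6B", 6.05e9), ("T0pp", 11e9)]:
    for precision in ("fp32", "fp16", "int8"):
        print(f"{name} {precision}: ~{weight_memory_gb(n_params, precision):.1f} GB")
```

On these numbers the 8-bit GPT-J weights come to roughly 6 GB, which is consistent with the suggestion that an 8 GB card can hold them for inference, and T0pp in fp32 lands in the "40+ GB" range mentioned earlier in the thread.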
 
Don't know ...never tried the 8-bit version on CPU. But I guess it will still take a lot of time to generate something on CPU. But if you decide to run it on GPU, an 8GB one (like GTX1080) should work fine. 1080Ti should be enough for fine-tuning this version.
How many days would it take to train GPT-J-6B model using 1080Ti GPU?

I'm planning to learn more about it. I thought it would cost more to train a model.
 