[Journey] 1 million UVs/month in 12 months using AI generated content. Let's do it!

spectrejoe

Jr. VIP
Joined
Sep 25, 2013
Messages
3,906
Reaction score
1,965
Website
skyrocket-growth.com
Haha! I believe I know you from somewhere. Something you mentioned in the above post just clicked.

Been building PAA sites for a year now. My largest one is 182k pages in an extremely competitive niche, with around 120k indexed.

Been using GPT-3 + a custom fine-tuned GPT-NeoX for content generation. Pegasus for rephrasing.

QA Format is very similar to yours as well.

I also use the QAPage schema in some instances.
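
For anyone who hasn't used the QAPage schema mentioned here, this is a minimal sketch of what the JSON-LD could look like for one question/answer pair. The function name and the sample text are made up for illustration; the property names are from schema.org, and the thread doesn't say how the markup is actually generated on these sites.

```python
import json


def build_qapage_jsonld(question: str, answer: str) -> str:
    """Return a schema.org QAPage JSON-LD string for one question/answer pair."""
    data = {
        "@context": "https://schema.org",
        "@type": "QAPage",
        "mainEntity": {
            "@type": "Question",
            "name": question,
            "answerCount": 1,
            "acceptedAnswer": {
                "@type": "Answer",
                "text": answer,
            },
        },
    }
    return json.dumps(data, indent=2)


if __name__ == "__main__":
    # Hypothetical sample content, not taken from any real site.
    print(build_qapage_jsonld("How long does paint take to dry?",
                              "Usually a few hours."))
```

The resulting string would go inside a `<script type="application/ld+json">` tag in the page head.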

Setup is very similar to yours. My tweaking and optimization is towards a different direction, though.

I use a different image generation library that gives me better shadows, more sigma control and crisper fonts.

I don't use the Indexing API (at least haven't yet, but may just use it as a test today on a brand new site).

All but 5 of my sites are on brand-new domains with zero backlink profiles.

Here's a site with its first post 4 days ago (domain regged same day).

Screen-Shot-2022-06-13-at-8-21-04-PM-1.png


No backlinks, brand new domain, some social signals and web 2.0, again very competitive niche.

Happy to connect and brainstorm if you're interested.
How well are your sites performing despite having 0 links?
 

Almostbr0ke

Junior Member
Joined
Jan 11, 2021
Messages
167
Reaction score
37
Is this hard to set up? I am using WPX like a noob
Not if you know how to configure WordPress/caching/Cloudflare. WP with PHP 7+ can easily handle 50k posts and millions of monthly views on a $20 droplet.
 

Sartre

Jr. VIP
Joined
Apr 1, 2010
Messages
530
Reaction score
560
Website
NoSandbox.com
Thanks @Sartre for sharing your method. I read all the comments in the thread over 3 days; there is a lot of state-of-the-art information.

I have some questions about your app:
1. How do you create the title and subheadings of your auto-generated articles?
2. Your app uses Levenshtein distance to find similar text between paragraphs. Did you test alternatives like spaCy cosine similarity or WMD (Word Mover's Distance) on fastText vectors?
3. Did you try the playwright-python package instead of Selenium?
playwright - damn this is good

I think I replied to the other ones before. Remind me if I didn't, please.
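
For context on question 2, here is a rough sketch of what Levenshtein-based paragraph matching can look like in plain Python. The function names and the 0.9 threshold are assumptions for illustration; the thread doesn't show the app's real code or settings.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Normalize the distance to a 0..1 score, where 1.0 means identical."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def drop_near_duplicates(paragraphs, threshold=0.9):
    """Keep a paragraph only if it is not too similar to one already kept."""
    kept = []
    for p in paragraphs:
        if all(similarity(p.lower(), k.lower()) < threshold for k in kept):
            kept.append(p)
    return kept
```

This compares characters only, so two paragraphs that say the same thing in different words will not match. That gap is what embedding-based measures like cosine similarity or WMD are meant to cover.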
 

tikaku

Newbie
Joined
Jun 7, 2013
Messages
14
Reaction score
1
playwright - damn this is good

I think I replied to the other ones before. Remind me if I didn't, please.
Yes, you replied to my comment before; thanks @Sartre for your support. I asked about playwright-python versus Selenium so you could test another great Python option for automating a browser. Playwright is fast and has a great community. Also, Node.js has better scraping libraries, like puppeteer and puppeteer-extra-plugin-stealth, for passing bot checks such as Cloudflare's.
 

Sartre

Jr. VIP
Joined
Apr 1, 2010
Messages
530
Reaction score
560
Website
NoSandbox.com
Yes, you replied to my comment before; thanks @Sartre for your support. I asked about playwright-python versus Selenium so you could test another great Python option for automating a browser. Playwright is fast and has a great community. Also, Node.js has better scraping libraries, like puppeteer and puppeteer-extra-plugin-stealth, for passing bot checks such as Cloudflare's.
I know JS is cool but looking at the syntax after working for so long with Python gives me cancer. I literally can't do it.

Is there some version of Node that doesn't use {} and ; in its syntax? :D
 

tikaku

Newbie
Joined
Jun 7, 2013
Messages
14
Reaction score
1
I know JS is cool but looking at the syntax after working for so long with Python gives me cancer. I literally can't do it.

Is there some version of Node that doesn't use {} and ; in its syntax? :D
There is a Sartre version that would be tailored to your requirements.
 

GGOH

Newbie
Joined
Jun 9, 2022
Messages
3
Reaction score
1
Thank you for coding that :D

You have surely noticed by now that Node is absolutely single-threaded, and thus not that useful ;).

Thanks for this writeup, have read yours and Preon's thread from beginning to end. I'm sure as we get further along people will blame all of these tools for "ruining the internet" (anything to avoid blaming ads, which are the true culprit).
 

Sartre

Jr. VIP
Joined
Apr 1, 2010
Messages
530
Reaction score
560
Website
NoSandbox.com
You have surely noticed by now that Node is absolutely single-threaded, and thus not that useful ;).

Thanks for this writeup, have read yours and Preon's thread from beginning to end. I'm sure as we get further along people will blame all of these tools for "ruining the internet" (anything to avoid blaming ads, which are the true culprit).
which tools? Node and Python? lol
 

GGOH

Newbie
Joined
Jun 9, 2022
Messages
3
Reaction score
1
which tools? Node and Python? lol

Python interpreters can be spawned in parallel as separate processes, though, which sidesteps the interpreter lock (the GIL). Whereas Node is absolutely a single-threaded parser (afaik... I used the term "absolutely" with intent in the above post, having built some mockups in Node-RED for proof of concept, but... the performance is always shit and like you I don't care for JavaScript... at all).

This guide has a relevant example (multi-cpu thread scraping of images).

https://towardsdatascience.com/multithreading-multiprocessing-python-180d0975ab29
 

Sartre

Jr. VIP
Joined
Apr 1, 2010
Messages
530
Reaction score
560
Website
NoSandbox.com
Python interpreters can be spawned in parallel as separate processes, though, which sidesteps the interpreter lock (the GIL). Whereas Node is absolutely a single-threaded parser (afaik... I used the term "absolutely" with intent in the above post, having built some mockups in Node-RED for proof of concept, but... the performance is always shit and like you I don't care for JavaScript... at all).

This guide has a relevant example (multi-cpu thread scraping of images).

I'm using multiprocessing with Python. Very simple. The bottleneck is Google anyway. But also I can't imagine ever wanting so many articles or having so many clients that I would need more than an AMD Epyc 7451 24c/48t. This stuff can run 100-200 parallel Selenium processes.
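
For anyone following along, a bare-bones version of the multiprocessing pattern might look like the sketch below. `process_keyword` is a hypothetical stand-in for the real per-keyword job (the thread doesn't show that code), and the worker count is arbitrary.

```python
from multiprocessing import Pool


def process_keyword(keyword: str) -> tuple[str, int]:
    """Stand-in for the real job (scraping, generation, ...).

    Here it only counts the words in the keyword so the example
    runs without a browser or network access.
    """
    return keyword, len(keyword.split())


def run_parallel(keywords: list[str], workers: int = 4) -> list[tuple[str, int]]:
    """Spread the keywords over separate worker processes."""
    with Pool(processes=workers) as pool:
        # map() preserves input order in the returned list.
        return pool.map(process_keyword, keywords)


if __name__ == "__main__":
    # The __main__ guard matters: worker processes may re-import this module.
    print(run_parallel(["best running shoes", "how to bake bread"], workers=2))
```

Each worker is its own interpreter process, so CPU-bound work is not serialized by the GIL. With browser automation, the practical limit is more likely RAM per browser instance and the remote site's rate limits.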
 

haykuroo

Newbie
Joined
Feb 23, 2016
Messages
17
Reaction score
4
I'm using multiprocessing with Python. Very simple. The bottleneck is Google anyway. But also I can't imagine ever wanting so many articles or having so many clients that I would need more than an AMD Epyc 7451 24c/48t. This stuff can run 100-200 parallel Selenium processes.
My bot was banned from Quillbot this week which is a real setback as their model performs far better than Pegasus ever has for me.
Do you have any pointers for building my own summary model? Cheers
 

Niffo

Jr. VIP
Joined
Jul 7, 2019
Messages
173
Reaction score
124
My bot was banned from Quillbot this week which is a real setback as their model performs far better than Pegasus ever has for me.
Do you have any pointers for building my own summary model? Cheers
Same bro, Quillbot is impossible to scale :(
 

Archemike

Regular Member
Joined
Jan 12, 2012
Messages
318
Reaction score
72
I'm using multiprocessing with Python. Very simple. The bottleneck is Google anyway. But also I can't imagine ever wanting so many articles or having so many clients that I would need more than an AMD Epyc 7451 24c/48t. This stuff can run 100-200 parallel Selenium processes.
Have you considered non-Google targets for content?
 

ComputerJunkie

Regular Member
Joined
Oct 9, 2012
Messages
331
Reaction score
105
Google's Indexing API can index 200 pages/day
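
To put the 200/day figure in perspective, here is a small sketch that splits a URL list into daily batches and builds the JSON body the Indexing API's publish endpoint expects. The helper names and example URLs are made up; authentication (an OAuth service account) and the actual HTTP calls are left out on purpose.

```python
import json
from math import ceil

# Publish endpoint of the Google Indexing API (v3).
INDEXING_ENDPOINT = "https://indexing.googleapis.com/v3/urlNotifications:publish"
DAILY_QUOTA = 200  # the per-day figure quoted in this thread


def daily_batches(urls, quota=DAILY_QUOTA):
    """Split URLs into lists of at most `quota` items, one list per day."""
    return [urls[i:i + quota] for i in range(0, len(urls), quota)]


def publish_body(url):
    """JSON body for one URL_UPDATED notification."""
    return json.dumps({"url": url, "type": "URL_UPDATED"})


def days_needed(page_count, quota=DAILY_QUOTA):
    """Days required to submit every page once at the given quota."""
    return ceil(page_count / quota)


if __name__ == "__main__":
    # Hypothetical URLs on a placeholder domain.
    urls = [f"https://example.com/post-{n}" for n in range(450)]
    print([len(batch) for batch in daily_batches(urls)])
    print(days_needed(10_000))
```

At 200 a day, submitting 10k pages once takes 50 days, which is why quota matters on larger sites.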

I mostly use aged domains with decent backlinks, so even without any indexing APIs it's not uncommon to have 10k pages indexed after 2 months.

I think it's less risky to have a website that doesn't have too many articles. Why keep articles that don't get impressions anyway? After 2-3 months it's quite clear looking at GSC.

1. I already replied twice to this. The title is the main keyword with positive volume. H2s are related PAAs.
2. We're mostly using YAKE for keywords + our own model for relevancy now.
3. I haven't. I will try it for another project. Looks interesting.
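
Point 1 above (title from the main keyword, H2s from related PAA questions) could be sketched roughly like this. The function name, the `max_h2` cap and the sample questions are assumptions for illustration, not the app's actual logic.

```python
def build_outline(main_keyword, paa_questions, max_h2=8):
    """Build a simple outline: title from the keyword, H2s from PAA questions.

    Duplicate questions (ignoring case and surrounding spaces) are dropped,
    and the first `max_h2` unique ones are kept in their original order.
    """
    title = main_keyword.strip().title()
    seen = set()
    h2s = []
    for question in paa_questions:
        key = question.strip().lower()
        if key and key not in seen:
            seen.add(key)
            h2s.append(question.strip())
    return {"title": title, "h2": h2s[:max_h2]}


if __name__ == "__main__":
    # Hypothetical keyword and questions.
    print(build_outline("best running shoes",
                        ["Are running shoes worth it?",
                         "How often should you replace running shoes?"]))
```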

That's really cool. We're working on our own app that does it.

I can get your API quota into the xx,xxx per day region if you manage to scale your sites to a point where 200 a day isn’t cutting it anymore.

DM me if you’re interested.
 

Sartre

Jr. VIP
Joined
Apr 1, 2010
Messages
530
Reaction score
560
Website
NoSandbox.com
I can get your API quota into the xx,xxx per day region if you manage to scale your sites to a point where 200 a day isn’t cutting it anymore.

DM me if you’re interested.
It's not possible to DM you, and selling outside of BST is forbidden. Can you just share your knowledge with others? :D
 