I was replying to someone via PM and ended up writing a ton of stuff about on-page SEO.
I decided to create a thread to share it here, otherwise it's a bit of a waste of 20-25 minutes if it only goes to one person!
Here's the reply below. It's basically a response to someone who is confused about why Google isn't showing better results, so I went on to explain that Google is an ALGORITHM. It's not some magic AI. It's still pretty damn basic in 2020 when it comes right down to it.
Here it is
-----------------------------
"Yeah, but it's created by humans to give the results we want, and they are good at knowing the intent nowadays. If I'm looking for the best treadmill, it's not for informational purposes, so I don't care what's the best treadmill in the USA. I wanna buy it, and I'm buying it here..."
It doesn't matter that it's created by humans. It's a very, very hard problem to solve.
They're good at knowing intent, but that doesn't mean they're going to produce perfect results for everyone who searches. The problem is you're viewing Google as being a lot smarter than it is, and that'll stop you from actually creating content that ranks. They are a lot more rudimentary than you think. You still need plenty of basic keywords, h2s, h3s and internal links with exact-match anchors in your content to rank. Try creating a super advanced, Ph.D.-level piece of content and you'll see that it gets beaten by a crappy article from a cheap outsourced writer who uses the right keywords, the right variations of those keywords and spammy h2s.
If you want to rank for "best laptop for nurses", you aren't going to do it by writing some amazing article full of technical language. Google will just NOT understand your article unless you include phrases like these:
nursing laptops, laptop for a nurse, best laptops for nurses, great laptops for nurses, what is the best laptop for nursing, nurses need, nurses are, created for nurses, designed for nurses, this laptop for a nurse.
If you don't have enough of that in there, Google will not understand that your article is great and perfect for a nurse looking for a laptop.
What they do understand since Hummingbird is this:
"best toaster" is the same as "what toaster is best", "toaster reviews", "the top toasters", "top 10 toasters" and "top 5 toasters".
This is what we mean by "topical groups".
It doesn't mean it's going to understand the complex relationship between sentences and paragraphs. It isn't going to understand things like:
"My neighbor had a big fat cat that loves to eat toast, which is why I helped her out by telling her about a review of the top 10 toasters"
"The best toasters around are created by engineers that were educated at top-tier schools like MIT, where they teach how to review toasters, and they help you pick out toasters that are suited for your own individual needs from lists of top 10 toasters"
"In this review we're going to look at the top 10 toasters available to buy today. There are multiple toaster manufacturers around, like Philips, Dualit and Russell Hobbs. A lot of these toaster brands have been around for decades, with a few of them being newcomers to the toasting industry. The basic toaster hasn't changed much in the past 30 years, but what has changed is the price point. You can find high quality cheap toasters for less than £50 online or on the high street."
It will have NO clue what those are about. It just looks at keywords. It'll think the first is more about toasters because you mention eat, toast, toasters, top 10, review, with something about neighbors and cats.
The second it will think is about the best toasters, teaching, education, toaster reviews, picking toasters and lists of the best toasters.
The third it will think is about toaster reviews, buying toasters, toaster manufacturers, toaster brands, toasting industry, basic toasters, prices, high quality toasters, cheap toasters, toasters less than £50, toasters available online, toasters on the high street.
Understand?
That's ALL it does when parsing text.
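To make that concrete, here's a rough sketch of what "just looking at keywords" does to the first example sentence. This is only my illustration of the idea, not Google's actual code. The function name `keyword_bag` and the tiny stopword list are made up for the example.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption for this example only).
STOPWORDS = {"a", "an", "the", "of", "to", "that", "which", "is", "why", "i",
             "her", "my", "by", "about", "had", "out", "and"}

def keyword_bag(text):
    """Lowercase the text, split it into words, drop stopwords, count the rest."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)

sentence = ("My neighbor had a big fat cat that loves to eat toast, which is why "
            "I helped her out by telling her about a review of the top 10 toasters")

bag = keyword_bag(sentence)
# Word order and grammar are gone; only the words and their counts are left.
print(sorted(bag))
```

Once the sentence is reduced to a bag like this, there's nothing left to say who helped whom or why. You just have toast, toasters, top, 10, review, plus some stuff about a neighbor and a cat.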
It has some basic ability to understand the salience of a word and the subject/object, but not enough to gain any meaningful understanding of one article vs another, and it won't for a long time.
IMO that deeper analysis is more of a theoretical or "supplemental" thing, i.e. it modifies little bits and pieces of meaning here and there. In general, this is what happens:
1) They get a big list of keywords, e.g. toaster reviews, picking toasters, high quality toasters.
2) They count them.
3) They weight them based on count and location, i.e. title, h1, h2, h3, bold.
4) They plug them into RankBrain to see how they all relate.
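Steps 1 to 3 can be sketched in a few lines. Treat this as a toy model under my own assumptions: the weights in `LOCATION_WEIGHTS` are numbers I made up (nobody outside Google knows the real ones), `weighted_keyword_scores` is a hypothetical helper, and step 4 (RankBrain) isn't modelled at all.

```python
from collections import defaultdict

# Hypothetical weights per location. The real values are unknown;
# these numbers exist only to show the idea.
LOCATION_WEIGHTS = {"title": 5.0, "h1": 4.0, "h2": 3.0,
                    "h3": 2.0, "bold": 1.5, "body": 1.0}

def weighted_keyword_scores(page, keywords):
    """page maps a location name to the text found there.
    Returns {keyword: sum over locations of (count * location weight)}."""
    scores = defaultdict(float)
    for location, text in page.items():
        weight = LOCATION_WEIGHTS.get(location, 1.0)
        lowered = text.lower()
        for kw in keywords:
            scores[kw] += lowered.count(kw.lower()) * weight
    return dict(scores)

page = {
    "title": "Toaster Reviews: The Top 10 Toasters",
    "h2": "High quality toasters under £50",
    "body": ("These toaster reviews cover high quality toasters "
             "and picking toasters for your kitchen."),
}
keywords = ["toaster reviews", "picking toasters", "high quality toasters"]

scores = weighted_keyword_scores(page, keywords)
for kw, score in scores.items():
    print(kw, score)
```

With these made-up weights, a phrase that appears in the title and the body scores higher than one that only appears in the body, which is the whole point of putting your main keywords in the title and headings.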
This means that if you have "best toasters" but no "toaster reviews", you can still rank for "toaster reviews" because it knows it's 90% the same thing. But that doesn't mean you don't need to mention toaster reviews. You build a better topical picture if you mention tons of these variations.
What we don't do anymore is just spam "best toasters" 50 times, or create separate pages for "best toaster" and "toaster review" like we did before Hummingbird. You still want to keep your keyword pairs (two-word phrases) under 2% density and your triples (three-word phrases) under 1%. This isn't hard unless you've got a low quality article. There's no penalty for having dozens of variations of a keyword that RankBrain understands are part of the same topic, and there's no penalty for single keywords. At least, nothing in the realm of normality. Maybe if you say "toasters" 1000 times in a 2000 word article, yes, you'll get a Penguin penalty. But there's no penalty for toaster reviews, this great toaster, you can buy toasters, cheap toasters, toasters are this, toasters are that, you can find toasters, this toaster here, that toaster over there. It's all just building up strong topical relevance.
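If you want to check those density numbers yourself, something like this works. I'm assuming density means "times the phrase appears, divided by total words in the article", which is one common definition but not the only one. `phrase_density` and `over_limit` are names I made up, and the 2% and 1% limits are just the rules of thumb from above.

```python
import re

def phrase_density(text, phrase):
    """Occurrences of the phrase divided by total word count, as a percentage."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    target = phrase.lower().split()
    n = len(target)
    if not words or n == 0:
        return 0.0
    hits = sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == target)
    return 100.0 * hits / len(words)

def over_limit(text, phrase, pair_limit=2.0, triple_limit=1.0):
    """True if a two-word phrase is at or above pair_limit percent,
    or a three-word phrase is at or above triple_limit percent."""
    n = len(phrase.split())
    if n == 2:
        limit = pair_limit
    elif n == 3:
        limit = triple_limit
    else:
        return False  # the rule of thumb only covers pairs and triples
    return phrase_density(text, phrase) >= limit

# A fake 100-word "article" that repeats one pair five times.
article = "best toasters " * 5 + "filler " * 90

print(phrase_density(article, "best toasters"))
print(over_limit(article, "best toasters"))
```

Run it over your draft for each of your main pairs and triples. If a normal, readable article trips the limit, that's usually a sign the phrase is being forced in.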
And you find out what keywords to use in your article by analysing your competitors on page 1, specifically the ones that are similar to you. I.e. if you're ecommerce, compare to ecommerce; if you're informational and have a weak site, compare to the weak informational ones on page 1.
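Here's a minimal sketch of that competitor comparison, assuming you've already copied the text of the similar page 1 results into strings. The helper names `phrases` and `missing_phrases`, and the "used by at least 2 competitors" cutoff, are my own choices for the example.

```python
import re
from collections import Counter

def phrases(text, n=2):
    """Set of all n-word phrases in the text (lowercased)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def missing_phrases(my_text, competitor_texts, n=2, min_pages=2):
    """Phrases used by at least min_pages competitors but absent from my_text."""
    seen_on = Counter()
    for text in competitor_texts:
        # phrases() returns a set, so each page counts a phrase only once.
        seen_on.update(phrases(text, n))
    mine = phrases(my_text, n)
    return sorted(p for p, pages in seen_on.items()
                  if pages >= min_pages and p not in mine)

competitors = [
    "Cheap toasters and toaster reviews for 2020",
    "Our toaster reviews cover cheap toasters",
    "Toaster reviews you can trust",
]
my_article = "The best toasters money can buy"

gaps = missing_phrases(my_article, competitors)
print(gaps)
```

On real pages you'd get a lot of junk phrases ("you can", "and the") mixed in, so expect to eyeball the list and pick out the ones that are actually on topic.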
Intent has nothing to do with this part. Intent is them trying to match keywords to pages based on the result of their topical analysis. For most keywords there's no magic special intent stuff going on. The magic in recent years is more with a small subset of ambiguous queries, like "jaguar". Do you want the cat, or do you want the car? Do you want information on the car? Do you want videos of the car? People want all of them, so all of them appear on page 1.
They have a machine learning algorithm that modifies results based on live clicks. This helps them better understand user intent, but for your typical "What are the best toasters", there's no user intent problem.