Why Most SEO Case Studies are BS

Deleted member 969102 · Mar 13, 2017

Now I'm by no means really into SEO, but I do read about it literally every day, and I often see people spouting the BS that they see in case studies (I have to admit, I was guilty of this too when I first started reading them).

The main problem with SEO case studies, at least the two that I'm going to mention later on, is that they are absolutely shit at statistical analysis. They can't distinguish the difference between correlation and causation.

I think it's much easier for me to use an example to explain what I mean, so, to begin with, we're going to be looking at Ahrefs' recent case study regarding page age and rankings:

What conclusion would you draw from this graph? Be honest, would it be that Google prefers older content? (I know it would)

While it may be true that Google prefers older content, this isn't actually what this graph shows, and the way that Ahrefs have presented it is actually very misleading. In Science, at least at the level that I study, we talk about how our results show a correlation between a variable and an outcome, but we don't have the knowledge, time, or equipment to prove that this outcome is caused by [Variable A].

In this case study, Ahrefs analysed 2,000,000 SERPs, noted down the age of the top ten results, plotted it in this fancy graph, and then strongly implied that Google prefers older content, when actually all this shows is that the top spots are older, not that all old content ranks better.

I know that some of you won't be able to tell the difference between the two scenarios, and you're the exact people who take this shit for fact. That case study didn't take into account the fact that webmasters will obviously spend more time and money on SEO as time goes on, and nowhere in the article did they acknowledge this.

I'm going to come back to my Science comparison for a minute. When we are testing something, we make a conscious effort to control every single variable that we can, so that we know it's actually [Variable A] that is causing the change, and not B, C, D, or E.

In cases where we can't control a variable, we measure that too and work it into our results. I guarantee you that if Ahrefs, Moz, or any other SEO blog that conducts these case studies measured other factors such as links, domain authority, and traffic (and many, many more), their graph would look nothing like that.
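To make "measure it and work it into your results" concrete, here is a minimal sketch on made-up data. It is not how Ahrefs ran their study. The assumptions are entirely mine: older pages have collected a higher link score, and the ranking score depends only on links, never on age. The names `age`, `links`, and `score` are hypothetical. The raw correlation between age and score still comes out strong, and it collapses to roughly zero once links are accounted for (a partial correlation, computed by correlating the leftovers of two straight-line fits).

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def residuals(ys, xs):
    """What is left of ys after removing a straight-line fit on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [y - (my + slope * (x - mx)) for x, y in zip(xs, ys)]

rng = random.Random(42)  # fixed seed so the toy data is repeatable
n = 5000

# Assumption: older pages have had more time to pick up links.
age = [rng.uniform(0, 10) for _ in range(n)]        # page age in years
links = [5 * a + rng.gauss(0, 10) for a in age]     # toy link score, grows with age

# Assumption: the ranking score depends ONLY on links, not on age.
score = [2 * l + rng.gauss(0, 20) for l in links]

# Naive view: age against score, ignoring everything else.
raw = pearson(age, score)

# Controlled view: strip out what links explain from both, then correlate the rest.
controlled = pearson(residuals(age, links), residuals(score, links))

print(f"age vs score, raw:                   {raw:.2f}")
print(f"age vs score, controlling for links: {controlled:.2f}")
```

In this toy setup a plot of age against score would look just like "Google prefers older content", even though age does nothing by construction.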

But they wouldn't do that, would they? Where's the story in that?

Now, some may argue that they did mention that this isn't a perfect representation right at the end, but that's not what matters. I've seen the graph that I shared above, and others from the case study, taken out of context and shared around so much, that people just take it as fact.

This is another example of what I mean. Someone who sees that graph come up in an SEO thread is going to start believing that it's true, without questioning the case study and how they conducted it. Did they take into consideration that people who spend more money on SEO may also spend more money on content? If so, how do I know from the graph?

tl;dr - If you can't control a variable, measure it. The vast majority of case studies that I have seen draw attention to the fact that there is a correlation between something like article length and average position, and then conclude that the outcome is caused by the variable.

To be clear, I am not arguing that these case studies are wrong in their conclusions (they are, but that's not the point here). I'm saying that the way they represent their data is wildly misleading and leads to the spread of misinformation and pure SEO lies. Take a look at this thread and then tell me I'm wrong.

What do you think? Do I make sense, or am I talking out of my ass?

asap1 · Mar 13, 2017

Agreed.

Gromit · Mar 13, 2017

I don't bother with them for the most part. I like ones that test shit like Solvemymaze's parasite case study but for the most part they're just clickbait trash.

"I often see people spouting the BS that they see in case studies"

This right here is the problem. Hamish Patel or whatever the fuck his name is says some clickbait bollocks on his site and everyone just laps it up without thinking or even bothering to question or test it.

Deleted member 969102 · Mar 14, 2017

Gromit said:
for the most part they're just clickbait trash.

Of course. If people started to question what these blogs were saying, they'd soon realise that they know as much about SEO as me (Possibly even less)

If you look at the Ahrefs CS, someone actually pointed out how it is already ranking pretty well for an SEO term and they're getting a ton of traffic.

xinga · Mar 14, 2017

So true! But you gotta realize that all these studies are nothing but a primitive link-scheme.

What also pisses me off like nothing else is the people at the Moz blog and other such venues. They are doing SEO for SEO and draw conclusions for the rest of the internet: how to run a Facebook group or build a mailing list that converts. The most prominent examples are the backlink-magnet infographics and skyscraper articles.

seoworldin · Mar 14, 2017

These are only a tool to manipulate you into linking to their images or text from your website so that they get better rankings, because Google evaluates how widely an image spreads and its exposure as a footprint, and for text you already know. The best case study is the one that you create yourself and get results (money) from.

validseo · Mar 14, 2017

I think you are largely correct. I have even seen math errors in the SearchMetrics studies. And all of them destroy their information with epic sample sizes. And none of them seem to give a shit about significance or articulating how others can reproduce or independently verify their results.

I do however think you should consider some additional ideas:

Yes, correlation != causation, but most of the time causal things correlate so they are in the mix. So if you have time to invest in tuning 12 on-page factors out of the hundreds of possibilities then correlation all by itself can make your guesses MUCH better than they would be otherwise.
Sample Size is not always a good thing. The SearchMetrics study blends a million different searches across industries, search intent, etc. It is the kind of information you would want if you only plan to rank for "average" terms across the whole web. But that isn't what we do. We always try to rank for "niche" terms. These studies would be infinitely more valuable if they just analyzed a single closely related niche (they still need to be statistically significant but often that can be achieved with as few as 100 samples). I can pretty much state that more than half (if not most) of their findings won't apply to your target keywords if you go into the field and take measurements.
Finally, any findings for one set of keywords will be TOTALLY BOGUS for other keywords. This is because where you rank and what you need to rank is totally dependent on the degree of tuning & optimization used by your keyword's competitors. Different keywords = different competitors = different degrees of tuning. If you follow advice like "articles should have 500 words" then you might be WAY under or WAY over, and it is very unlikely the advice will be spot on unless you go and measure your competitors.

Deleted member 1006940 · Mar 17, 2017

hasa1015 said:
They can't distinguish the difference between correlation and causation.

That's where the line is drawn.. thank you for stating it so concisely!

Ozzyzig · Mar 25, 2017

Gromit said:
I don't bother with them for the most part. I like ones that test shit like Solvemymaze's parasite case study but for the most part they're just clickbait trash.

"I often see people spouting the BS that they see in case studies"

This right here is the problem. Hamish Patel or whatever the fuck his name is says some clickbait bollocks on his site and everyone just laps it up without thinking or even bothering to question or test it.

Thanks for the kind words, means a lot.
In the situation with the case study I did, it was to see really if someone could get some good rankings using spammy shit like the old days. If done properly, you can. Not that I'm saying 301s are spammy, but the GSA and PBN posts are.
Patel is an absolute arsewipe though. His posts really make me want to stab my eyes out and eat them just so I don't need to see his fucking drivel again.

Ghost Hunter · Mar 25, 2017

SolveMyMaze said:
Thanks for the kind words, means a lot.
Patel is an absolute arsewipe though. His posts really make me want to stab my eyes out and eat them just so I don't need to see his fucking drivel again.

Glad I am not the only one who felt the same. His articles were written purely to generate money.

Maverick SEO · Mar 25, 2017

SEO guru bullshit fed through case studies. I don't bother with them.

ThopHayt · Mar 25, 2017

If you want proof that people go crazy with correlation assumptions you need look no further than the advice that has been thrown around with regards to "Fred" lately. Nobody knows what "Fred" is, nor has it been confirmed... yet its name has become a buzzword around here.

-ThopHayt

Ozzyzig · Mar 25, 2017

ThopHayt said:
If you want proof that people go crazy with correlation assumptions you need look no further than the advice that has been thrown around with regards to "Fred" lately. Nobody knows what "Fred" is, nor has it been confirmed... yet its name has become a buzzword around here.

-ThopHayt

The only thing I know about Fred is that the fucker was married to Wilma.

sirmeep · Mar 29, 2017

Can someone point me to any explanation that details out the problems with excessively large sample sizes? Like I have a vague grasp on why this is a problem, but don't know what's the underlying math behind it.

Just read through this... http://stats.stackexchange.com/questions/125750/sample-size-too-large
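One piece of the underlying math, as a rough sketch rather than a full answer: the usual t statistic for testing whether a Pearson correlation r from n samples differs from zero is r * sqrt((n - 2) / (1 - r^2)). It grows with the square root of n, so with enough samples even a correlation that explains almost nothing passes a significance test. The r = 0.01 below is a made-up value (it explains 0.01% of the variance), and the 1.96 cutoff is the usual approximate two-sided 5% threshold.

```python
import math

def t_statistic(r, n):
    """t statistic for testing whether a Pearson correlation r from n samples is nonzero."""
    return r * math.sqrt((n - 2) / (1 - r * r))

# Same tiny, hypothetical correlation at three sample sizes.
for n in (100, 10_000, 1_000_000):
    t = t_statistic(0.01, n)
    verdict = "significant" if abs(t) > 1.96 else "not significant"
    print(f"n={n:>9,}  r=0.01  t={t:6.2f}  {verdict} at roughly the 5% level")
```

So "statistically significant" on a million SERPs only says the correlation is probably not exactly zero. It says nothing about whether it is big enough to matter for your rankings.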

validseo said:
Sample Size is not always a good thing. The SearchMetrics study blends a million different searches across industries, search intent, etc. It is the kind of information you would want if you only plan to rank for "average" terms across the whole web. But that isn't what we do. We always try to rank for "niche" terms. These studies would be infinitely more valuable if they just analyzed a single closely related niche (they still need to be statistically significant but often that can be achieved with as few as 100 samples). I can pretty much state that more than half (if not most) of their findings won't apply to your target keywords if you go into the field and take measurements.

Like this makes sense, but how do you prevent specific nuances within a single niche from disproportionately affecting any analysis you're running?

darulez · Mar 29, 2017

It all depends on your personal needs.
But for me, at least some A/B test should be done.
Otherwise, at least wait 4 (!) weeks to see whether something changes or gets worse. That is often not the case.
For every fkn link or package I point at a subpage or root, I wait 4 weeks to see if something changes in the rankings.

For on-page, however, 5-7 days are mostly enough.

On topic:
The same shit happens when some guys compare the CTR of the top 10.
#1 will always have a better CTR than #3.
So that is also some warriorforum crap advice with no causation behind it.
Same for article length.

The "xxx hack" guys hardly care about 2,500-word articles when they want to rank for 1 keyword... and they don't, as most CPIs know.

Aty · Mar 29, 2017

ThopHayt said:
If you want proof that people go crazy with correlation assumptions you need look no further than the advice that has been thrown around with regards to "Fred" lately. Nobody knows what "Fred" is, nor has it been confirmed... yet its name has become a buzzword around here.

-ThopHayt

Wtf is Fred?

validseo · Mar 29, 2017

sirmeep said:
Can someone point me to any explanation that details out the problems with excessively large sample sizes? Like I have a vague grasp on why this is a problem, but don't know what's the underlying math behind it.

Just read through this... http://stats.stackexchange.com/questions/125750/sample-size-too-large

You have to consider relevance and diluting metrics. Most of us rank in niche-specific keywords. If SearchMetrics looks for trends in a million searches like "Sandwiches, Hotels, Cars, Bank Accounts,..." the likelihood that the correlations will match the correlations for your specific term is very low, especially if your term is longer tail than terms like "Hotels" and "Cars".

Additionally, by spanning so many types of searches they are crossing over different algorithms and blending the results together. For example, mixing shopping searches like "buy red apples" with informational searches like "22nd US president"... the features displayed in the result sets for those searches are different. This demonstrates that Google is algorithmically detecting those types of searches and is treating them differently. Blending them together and calling it "rules that apply to all" is a big mistake.

You need to analyze like searches. For example, you have to study men and women separately in many situations or you get a bad outcome. Like in designing clothes: your average shopper has one testicle and one mammary. You wouldn't want to design your clothing line for that though. It is a ridiculous example, but it is basically the same mistake that occurs in the SearchMetrics studies. It would be far more valuable for SearchMetrics to do the same study with a much smaller sample of, say, "Common Dentist Search Terms". We would learn a lot more and there would be better odds of some of the findings being applicable to similar industries.

The final part is that they wear sample size like it is a badge of honor that matters more than anything else and makes the data indisputable. Well, I dispute it. It's flawed. Statistical significance matters more. The data they produced, while "conceptually interesting", is practically useless in terms of helping you rank for your search terms.
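The blending problem above can be sketched with two made-up niches. Everything here is hypothetical: the niche labels, the word-count range, and the slopes are assumptions chosen to make the point, not measurements of Google. In one niche longer pages score higher, in the other they score lower. Pool them and the correlation comes out near zero, which matches neither niche.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(7)  # fixed seed so the toy data is repeatable

def make_niche(slope, n=2000):
    """Hypothetical niche where the ranking score moves with word count at the given slope."""
    words = [rng.uniform(300, 3000) for _ in range(n)]
    score = [slope * w + rng.gauss(0, 300) for w in words]
    return words, score

# Assumption: informational searches reward length, shopping searches punish it.
info_words, info_score = make_niche(slope=1.0)
shop_words, shop_score = make_niche(slope=-1.0)

info_r = pearson(info_words, info_score)
shop_r = pearson(shop_words, shop_score)
pooled_r = pearson(info_words + shop_words, info_score + shop_score)

print(f"informational niche: {info_r:+.2f}")
print(f"shopping niche:      {shop_r:+.2f}")
print(f"both blended:        {pooled_r:+.2f}")
```

A study that only reported the blended number would tell you word count does not matter, and that would be wrong for both niches.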

Like this makes sense, but how do you prevent specific nuances within a single niche from disproportionately affecting any analysis you're running?

The analysis of the search term you are trying to rank for IS MUCH more important than the analysis of the "average search term". Your fear of outliers is destroying your data if you solve it that way. As long as you are statistically significant then the more niche you can get the better the data will be to help you rank. Good science is about testing things discretely and controlling variables. Throwing everything into one big pot is kind of the opposite of that.

mickyfu · Mar 29, 2017

Most SEO case studies are bullshit because they are written by kids who rank for fuck all.

Reaver · Mar 29, 2017

ThopHayt said:
If you want proof that people go crazy with correlation assumptions you need look no further than the advice that has been thrown around with regards to "Fred" lately. Nobody knows what "Fred" is, nor has it been confirmed... yet its name has become a buzzword around here.

-ThopHayt

Why the hell is it called Fred?? That is seriously my biggest problem.

jazzc · Mar 29, 2017

validseo said:
Yes, correlation != causation, but most of the time causal things correlate so they are in the mix.

You are still confused about the correlation not equals causation thing.

You said:

* correlation != causation
* but causation -> correlation
therefore maybe correlation == causation after all.

Which is, like, no.

Simple thing. Correlation alone does not logically imply causation. Never. Not even once. Period.
