
What is today's SEO - full insight, NOT

Discussion in 'White Hat SEO' started by madoctopus, Mar 29, 2011.

  1. madoctopus

    madoctopus Supreme Member

    Joined:
    Apr 4, 2010
    Messages:
    1,249
    Likes Received:
    3,498
    Occupation:
    Full time IM
    The driving element behind this thread is another thread that you can read here:

    http://www.blackhatworld.com/blackhat-seo/white-hat-seo/289589-what-todays-search-engine-optimization-full-insight.html


    Long story short, it is too full of crap. Worse than that, it is written in a way that makes it look like it's some kind of tasteful dish made by a French chef. That last part is what kinda annoys me. I'm OK with freedom of speech and this is a forum where everybody's input is welcome, but making certain statements and tagging them as "correct" or "accurate" should happen only when you know what the fuck you're talking about.

    First of all, I will pick up on the statements ExtraWinner made, explain them myself, and point out why I think otherwise. Very little of what he explained is correct or even close to reality. Some things are correct, but very few, and the overall message he transmits is not.


    Spiders

    Google spiders - 4 main groups: backlinks spiders, content spiders, picture spider, video spiders. NO!

    There is ONE single spider. ONE, not many. The spider does one thing and one thing only: it discovers links and passes the page content to the indexing system. It started in the beginning with a small number of sites that had many outbound links and from there discovered the entire web. The spider runs as a distributed system, obviously, but at a logical/software layer it behaves as if centralized. What happens is the spider looks up the "database" of known pages (URLs) and visits those URLs. It builds a list of all URLs on the page and saves every URL that wasn't already in the "database". Also, since it already consumed bandwidth requesting the page, it passes that page (the actual HTML and possibly other details) to the indexing system, so there is no need for a new retrieval. That's it! The spider's job is not to do any sort of analysis or "thinking", just to spider. The way it does this is not important, but there are a few (useless) things you might like to know (a toy sketch follows the list below):


    • It makes thousands of requests at the same time.
    • Though it can make thousands of requests at the same time, it has to hit the web servers it crawls slowly so it doesn't produce a DDoS effect on them. That is why it has a subsystem that decides which URLs to crawl next, in such a way that it doesn't hit many URLs on the same server at the same time.
    • It has to avoid loops. A single PHP script can generate an infinity of actual pages by using URL parameters. A subsystem of the spider has to make sure the spider doesn't get trapped in such an "infinite" loop. The same system decides whether a URL should be crawled at all, based on several factors. I won't go into details here, but most importantly, it will not index too much duplicate content or URLs that are too deep within a subtree (which generally translates into "have low PageRank").
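
    To make those bullets concrete, here is a toy sketch of that single-spider loop in Python: one frontier of known URLs, a per-host politeness delay, and a hand-off of the fetched HTML to the indexer. Every name and number is made up, and this is a deliberate simplification for illustration, not Google's actual architecture.

    ```python
    import time
    import urllib.parse
    import urllib.request
    from collections import deque
    from html.parser import HTMLParser

    class LinkExtractor(HTMLParser):
        """Collects href values from <a> tags."""
        def __init__(self):
            super().__init__()
            self.links = []

        def handle_starttag(self, tag, attrs):
            if tag == "a":
                for name, value in attrs:
                    if name == "href" and value:
                        self.links.append(value)

    def hand_off_to_indexer(url, html):
        # Stand-in for the indexing system; the spider itself does no analysis.
        print(f"indexing {url} ({len(html)} bytes)")

    def crawl(seed_urls, max_pages=100, per_host_delay=5.0):
        frontier = deque(seed_urls)        # known-but-unvisited URLs
        seen = set(seed_urls)              # the "database" of known URLs
        last_hit = {}                      # host -> time of last request
        fetched = 0
        while frontier and fetched < max_pages:
            url = frontier.popleft()
            host = urllib.parse.urlparse(url).netloc
            # Politeness: don't hammer one server (the anti-DDoS rule above).
            if time.time() - last_hit.get(host, 0.0) < per_host_delay:
                frontier.append(url)       # push it back, try another host
                time.sleep(0.1)            # avoid a busy loop on tiny frontiers
                continue
            last_hit[host] = time.time()
            try:
                with urllib.request.urlopen(url, timeout=10) as resp:
                    html = resp.read().decode("utf-8", errors="replace")
            except Exception:
                continue                   # dead or slow URL, just move on
            fetched += 1
            # The bandwidth is already spent, so pass the page along now.
            hand_off_to_indexer(url, html)
            extractor = LinkExtractor()
            extractor.feed(html)
            for link in extractor.links:
                absolute = urllib.parse.urljoin(url, link)
                if absolute not in seen:   # save only URLs not already known
                    seen.add(absolute)
                    frontier.append(absolute)
    ```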

    Just to make it clear, by PageRank, in this particular instance, I am not referring to the 0-10 value but to the real internal value and especially the page rank computed from the incoming graph edges.

    The indexing system does not really try to understand the text. It just saves it in something that is very likely similar to a reverse index. Because of the architecture of the reverse index, they can figure out which pages are good candidates for a search query. Also because of the architecture of the reverse index, it is somewhat easy to figure out poorly spun or copied content. There are some additional layers there, but the foundation itself is the architecture of the reverse index. Google has been capable of detecting duplicate content for a long time; it just didn't use that as a ranking or penalty factor, relying mostly on links instead. Even today, with the Panda/Scraper update, it still can't do this perfectly, though they say they got pretty good at it.
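
    To show why a reverse (inverted) index makes both of those jobs cheap, here is a toy version with a crude shingle-overlap duplicate check. This is only a guess at the kind of technique involved, not how Google actually stores or compares documents.

    ```python
    from collections import defaultdict

    index = defaultdict(set)   # term -> ids of documents containing it

    def add_document(doc_id, text):
        for term in text.lower().split():
            index[term].add(doc_id)

    def candidates(query):
        """Documents containing every query term are candidates for the query."""
        sets = [index.get(term, set()) for term in query.lower().split()]
        return set.intersection(*sets) if sets else set()

    def shingles(text, k=3):
        """All runs of k consecutive words."""
        words = text.lower().split()
        return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

    def similarity(a, b):
        """Jaccard overlap of shingle sets; near 1.0 suggests copied/spun text."""
        sa, sb = shingles(a), shingles(b)
        return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

    add_document(1, "best seo tools for link building")
    add_document(2, "best seo tools for the link building beginner")
    print(candidates("link building"))   # {1, 2}
    print(similarity("best seo tools for link building",
                     "best seo tools for the link building beginner"))  # 0.25
    ```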

    For more information, look up the Wikipedia articles about web crawlers, search Google Videos for the presentations by Google engineers about the architecture of their system, and read the research papers on the topic.


    Google dance

    I have no idea why it happens exactly, but I suspect it is a side effect of the computational process that figures out what score a URL should get for a particular keyword. Very likely not all factors get computed at the same time. What I am pretty sure about is that it is NOT the result of many spiders each giving scores. Why do I think that? Because it would not be optimal design to have more than one spider, or to have separate components hand out scores (+ or -) by themselves. How it actually works is an educated guess on my part, but I am pretty sure it is not the result of multiple spiders each giving different scores.


    Sandbox

    The explanation, though nice to understand and maybe seeming to make sense, again is not correct. I have no idea how a penalty is figured out or computed internally, but it is not a number resulting from adding + and - scores.

    For example, there is no such thing as a content penalty. It's just that your content isn't a good enough candidate for the searched keyword. It is a matter of ranking content based on factors (e.g. relevance), and the content that doesn't rank simply didn't make it.

    Most penalties however are at link level. They can only be computed by analyzing the link graph (coupled with some other relevant info) and detecting certain "features" that the link graph for a site has. This is to a large extent based on probability/statistics and also takes into consideration patterns (e.g. 90% of links are 301 redirects). It is possible to use machine learning algorithms to detect patterns, but I wouldn't be sure that this is how it's done.

    I won't go into details here, but I can tell you that when you have the entire link graph (like Google has), it is very easy to spot link networks or link schemes. Add to that the extra information Google has (e.g. host, registry) and it becomes a very easy job. You can read more about this in research papers. If you do, you will also realize how the link schemes many of you come up with (at least many I have seen around here) are unnatural and can easily be detected. What you have to keep in mind is that Google has a pretty permissive threshold so it doesn't accidentally hit clean sites. Obviously, that still happens, but in Google's eyes it is acceptable collateral damage.
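
    Just to give a flavor of how obvious a crude scheme looks when you hold the whole graph, here is a naive sketch that flags small groups of domains that all link to each other and sit in the same C-class. The data and thresholds are invented, and a real system would use far more scalable graph analysis than brute-force combinations.

    ```python
    from itertools import combinations

    # domain -> set of domains it links out to (a tiny slice of a link graph)
    links = {
        "a.com": {"b.com", "c.com"},
        "b.com": {"a.com", "c.com"},
        "c.com": {"a.com", "b.com"},
        "d.com": {"a.com"},
    }
    # the kind of extra information Google has: hosting IP per domain
    host_ip = {"a.com": "1.2.3.4", "b.com": "1.2.3.5",
               "c.com": "1.2.3.6", "d.com": "9.9.9.9"}

    def reciprocal(a, b):
        return b in links.get(a, set()) and a in links.get(b, set())

    def suspicious_groups(min_size=3):
        """Flag fully reciprocal groups; shared C-class hosting makes it worse."""
        flagged = []
        for group in combinations(sorted(links), min_size):
            if all(reciprocal(a, b) for a, b in combinations(group, 2)):
                c_classes = {".".join(host_ip[d].split(".")[:3]) for d in group}
                flagged.append((group, len(c_classes) == 1))
        return flagged

    # [(('a.com', 'b.com', 'c.com'), True)] -- a closed triangle on one C-class
    print(suspicious_groups())
    ```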

    Most important thing everybody should keep in mind is that though Google COULD do something it doesn't mean it actually does it. Google can do a lot of things and can use many signals to figure out stuff, however sometimes doing that would be too computationally intensive or would result in penalizing too many legit sites.

    Also, the penalizing process happens at various levels, and a penalty can be eliminated if a subsystem computes that some other factors are more important. For example, if you very quickly blast a ton of comments and forum posts at a new site, it gets penalized. Do the same to an established site and nothing happens, or you may even help it. Also, statistically speaking, if 9 out of 10 signals scream "blackhat" then you can guess what happens. If it's 1 signal out of 10, nothing happens, because the statistical confidence is too low to make a decision.


    Authority


    Link graph data determines domain and page authority. That, coupled with the content score for the query, determines position in the SERP. It is an overly simplistic view of the problem, but it is the part that matters. No matter how well you optimize anchor text and content for a keyword, a trusted domain will outrank you even if that keyword only appears once on its page. I have a site that ranks for "keyword 2011" (or whatever year we are in) just because I have "Copyright (c) 2010-2011. All rights reserved." in the footer. If I have 20 ******** links with the anchor "read more" on pages about winkels, dinkels, splocks, shmoks, etc. on top trusted domains/pages, then on my page I have "Copyright 2010-2011" and a user searches for "keyword 2011", I will rank #1 for that, before your site that got blasted with 50,000 links from ScrapeBox and Xrumer with several variations of "keyword". Why? Because if a stranger on the street says he will take care of your 5-month-old baby, you will beat the crap out of him. But if your friend from childhood says it, you know you can trust him. Then again, if I, an SEO guy, get dressed in a white coat, come into your room at the hospital and say "I have bad news. We did this test and you have colon cancer.", you will believe me. Just because I have a white coat. Blackhat is about putting on a white coat, not about yelling "I am a doctor" louder. Again, you can read more about this in research papers.

    Is PageRank important?

    Both are correct:


    • There is no PageRank
    • PageRank is the fundamental measure to rank a site

    What do I mean? There is a "rank" in the algorithm. That is what matters. It is derived from many "ranks" combined using a complex algorithm (not simple addition but complex rules). It is not just a matter of adding 2 average links + 3 average links + 4 super weak links and getting the result 1. That "2+3+4=1" ExtraWinner presented is really misleading: I can look at it from multiple perspectives and it can mean many things, and the same goes for everyone who reads it. Joe's reading is not Mike's reading. Also, how Google computes real PR is not even close to how ExtraWinner explained it. Not even close. It isn't even the same as described in the original PageRank paper, though I would guess that was the foundation for today's PageRank. The toolbar PageRank is not very important, though. While it gives a good estimate in most cases, you can easily get to rank #1 for a solid keyword with a PR0 page.
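
    For reference, the iteration from the original PageRank paper looks roughly like the sketch below. Again, this is at best the foundation of today's "rank", not what Google actually runs.

    ```python
    def pagerank(links, damping=0.85, iterations=50):
        """links: page -> list of pages it links out to."""
        pages = set(links) | {t for targets in links.values() for t in targets}
        n = len(pages)
        rank = {p: 1.0 / n for p in pages}
        for _ in range(iterations):
            new = {p: (1.0 - damping) / n for p in pages}
            for page in pages:
                targets = links.get(page, [])
                if targets:
                    share = damping * rank[page] / len(targets)
                    for t in targets:
                        new[t] += share
                else:
                    # A dangling page spreads its rank over the whole graph.
                    for p in pages:
                        new[p] += damping * rank[page] / n
            rank = new
        return rank

    # "c" collects rank from two pages and ends up highest.
    print(pagerank({"a": ["c"], "b": ["c"], "c": ["a"]}))
    ```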


    N0F0llow?

    Worse than useless. Worse? Yes.

    How is that? Well, imagine that every time you have an orgasm somebody cuts a finger from your hand, then from your feet, then your ears, then your nose... now nobody wants to have sex with you anymore because you look like a monster from a bad zombie movie. WTF am I talking about? N0f0llow is worse than useless because it is dangerous. It is easy to build n0f0llow links (e.g. ScrapeBox), you build many links (=spam), you shit on somebody's site to do it, you get the sensation you're accomplishing something because of the large number of links you build, everybody tells you it's good, everybody does it. The result? A #1 ranking in Google? Maybe for some weak keyword. More likely just your site being flagged in Akismet as a spam source, plus a bunch of other nasty "penalties". If 10,000 smiling homeless dudes vouch out of the blue that some dude I don't know is a good babysitter, do I trust him with my baby? Not really.

    So you see, n0f0llow is not evil or good. No. By itself n0f0llow is a tool. Noobs make it worse than useless because of how they use it. I wouldn't mind a n0f0llow link from Wikipedia, but I would hate to have my site blasted to 50,000 blogs with a generic comment and the same keyword. Why? Because my site is too good for that. My site gets "real" links. My site is too good to look at your site. My site won't play with your site.

    Oh, btw, if you want a technical answer or results from tests of why n0f0llow doesn't help, I could give you that as well, but I won't. I just want to say that nothing is absolute. Not even n0f0llow. It is not good or bad. It CAN be good OR bad, good AND bad or neither.


    What is hot these days?

    Same things that were hot yesterday and will be hot tomorrow:


    • Offering value!
    • Presenting your shit in a nice package so that the value can easily be seen!

    What I recommend:

    • Multi-site link exchange from paragraphs of text in the post.
    • Your own PROPERLY BUILT link and content networks.
    • Links from high authority sites.
    • Guest posts.
    • Contests / prizes.
    • Game changers - radical stuff that will give you an edge. I have some. Do you, or are you just following the pack?

    What is hot as in popular:

    • Some sort of crap that doesn't work too great if you have real competition, but is noob friendly.

    ExtraWinner recommends submitting to just one (or maybe the top) article directories. I recommend blasting content everywhere. If you can post some content with a link on even the shittiest sites on the internet, do it. Why? Because domain, IP and C-class diversity is one of the main factors that improve trust rank.
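
    If you want to measure that diversity on your own link profile, the C-class is just the first three octets of each linking IP. A quick sketch, assuming you have already resolved each linking domain to an IP (the data below is made up):

    ```python
    def c_class(ip):
        """C-class of an IPv4 address: '1.2.3.4' -> '1.2.3'."""
        return ".".join(ip.split(".")[:3])

    def link_diversity(backlinks):
        """backlinks: list of (domain, ip) pairs for pages linking to you."""
        return {
            "domains": len({domain for domain, _ in backlinks}),
            "ips": len({ip for _, ip in backlinks}),
            "c_classes": len({c_class(ip) for _, ip in backlinks}),
        }

    # 3 links but only 2 C-classes: the first two hosts likely share a network.
    print(link_diversity([("x.com", "1.2.3.4"),
                          ("y.com", "1.2.3.9"),
                          ("z.com", "8.8.4.4")]))
    ```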

    Then again, content is expensive. Or is it? Spin properly and in a year, with 4 hours of spinning work a day, you could produce 150,000 articles. I'm not saying you will actually be able to do that. I am not even saying you know how to spin properly. I'm saying it is doable. Now, the really big problem is actually finding 150,000 different hosts to post that content on in order to maximize the effect of the links.
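
    Those numbers only work if each article is generated rather than written by hand. As an illustration only, a minimal expander for the common {option|option} spintax syntax; nested groups resolve innermost-first:

    ```python
    import random
    import re

    BRACE = re.compile(r"\{([^{}]*)\}")  # matches an innermost {a|b|c} group

    def spin(text):
        """Replace every {a|b|c} group with one randomly chosen option."""
        while True:
            match = BRACE.search(text)
            if match is None:
                return text
            choice = random.choice(match.group(1).split("|"))
            text = text[:match.start()] + choice + text[match.end():]

    print(spin("{Great|Solid|Useful} content {wins|ranks} {every time|in the long run}."))
    ```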


    Is Google smart?

    Yeah, both the people and the software. After all, the software manages to beat some of you, doesn't it?

    ExtraWinner, Google guys may be on BHW to check the latest noob trends. They won't get pissed though, you can be sure of that. I am pretty sure that figuring out how to cope with the really blackhat and intrusive techniques organized crime cartels use is higher on their priority list than what script kiddies do. By the way, I am not implying all BHW members are noobs. By my measure most are, but there are also some that could easily make me look like a noob. Too few, and very silent, as they should be.


    Perfect website?

    ExtraWinner said "having around 10 links on each page. External links to authority sites that are anyhow under same category.". Ever heard of the term "SEO black hole"? YouTube is an "SEO black hole". All big sites are "SEO black holes". Linking out to authority sites makes those sites even bigger authorities. Instead, do your best to get links from those sites, so that YOU become the authority.

    Also, did you pull those suggestions out of your ass, man? You did. You may think you didn't, but you did. How do I know? There is no such thing as a "perfect website". Having certain proportions, certain percentages for types of links, etc.? WTF? You might have some sites that have those characteristics and are doing great, but that doesn't mean it is a generic "rule" or even an "in most cases" thing. A while ago, some folks like you came up with a magic number for keyword weight on a page, at 4% or so. Then every other "expert" out there made it 3%, 5%, 10%, whatever, just so his number would be "unique", lol. Not even somebody from Google would be able to say what the "perfect site" is. There is no such thing. Is the perfect site one with 1 million pages of unique content and 5 million links, including about 50,000 links from the top 10,000 sites on the internet? I mean, WTF man, are you that much of an expert that you know what perfection looks like? Do you even know how to measure content uniqueness? Do you know how Google measures it? There are many ways to do it. Which one should I use to make sure I have 80% unique content? Some big citations? Why not instead have other big-ass sites quote my site and put in a link towards my site?


    Accuracy of info?

    SEO now is not what it was 5 years ago. I know a guy who has been doing SEO for 10 years and he sucks at it. I mean, don't get me wrong, he is successful, he makes a lot of money, etc., but not because he's a great SEO, but because he's a great businessman. Also, my personal opinion is that the information you presented is not only inaccurate but also misleading at a subtle level. Very misleading. Also incorrect, which is a different thing from inaccurate. It's one thing to miss a rabbit by an inch, and another to not know there was no rabbit there in the first place.

    ExtraWinner, I see that you replied to some who didn't "like" your post that you wanted to keep it simple and easy to understand. That is not an excuse for thin, misleading, incorrect information. The "noobs" you apparently wanted to help don't get smarter from incomplete and incorrect information. In the case of SEO, incomplete easily translates into "very wrong/incorrect" and in the long term affects your approach to SEO in a very bad way. While what you said might seem plausible, might seem to make sense, etc., the complete message you transmitted is wrong. The entire underlying message, and in many cases the factual data, is incorrect.

    Also, in SEO you can't do accurate tests. As a matter of fact, you probably get more accurate guesstimates by just thinking like a search engine engineer: pose yourself the problems he would have and try to find solutions. If you're a good coder with a solid foundation in critical thinking, you might just pull it off.

    I recommend you find the research papers on the topics you are interested in before you think you know how things work. After you recover from the amazement, remember that Google implements these things more effectively than they were outlined in those research papers.

    Anyway, this is my view on the "full insight" you offered. My thread doesn't have much insight. As a matter of fact it is quite "thin" and incomplete. I would say in the "SEO school" it is at kindergarten level.

    In the end, you choose what you believe, regardless of the truth. It is best to do the research yourself, but most people prefer the ease of believing over the struggle of researching.

    ExtraWinner, I'm not trying to attack you as I don't know you and have nothing against you. However, I wanted to make some things clear.
     
    • Thanks x 33
  2. ScrapeBoss

    ScrapeBoss Elite Member Premium Member

    Joined:
    Nov 25, 2010
    Messages:
    1,865
    Likes Received:
    669
    Location:
    123.456.789.012.345.678.901.234.567
    Home Page:
    Thanks for this post. It's good to have other people's views on issues like this.

    I will take the time to really read it and compare it with ExtraWinner's message.
     
  3. Subsonic

    Subsonic Regular Member

    Joined:
    Mar 17, 2011
    Messages:
    367
    Likes Received:
    333
    Location:
    DNS root zone database
    lol, I'd love to have some video bots visit my site :D As we all know (or at least should know), it's not really possible for a web spider to analyze the content of a video, so there's absolutely no need for a separate video spider. The only thing the spider could do would be to read some description tags or something, but that's what the ONE AND ONLY G spider is for :)
     
    Last edited: Mar 29, 2011
  4. DoHNuT555

    DoHNuT555 Newbie

    Joined:
    Feb 1, 2011
    Messages:
    45
    Likes Received:
    5
    Good post, probably the only one I have read all the way through.
     
  5. Simas9

    Simas9 Jr. VIP Jr. VIP Premium Member

    Joined:
    Sep 28, 2010
    Messages:
    1,869
    Likes Received:
    825
    Location:
    Vienna, Austria
    Thanks. One noob question if you don't mind: should the external links on my site that point to authority websites be n0-f0llow, or should I just get rid of them?
     
  6. alex1

    alex1 Junior Member

    Joined:
    May 23, 2009
    Messages:
    123
    Likes Received:
    110
    Occupation:
    Software Developer
    Location:
    Toronto, Canada
    What really got me was the fact that he tried to sound like an AUTHORITY on the subject, while barely knowing the basics.

    Here is just one example - look at Page 2 of that thread, Post #74, where he was asked this question:



    And this is his reply:


    This is simply not true. The correct answer is that the image bot clearly identifies itself by using a different User-Agent string from the regular Google bot. If you had looked at a server log even ONCE, you would know that. It looks like ExtraWinner has never paid attention to such boring details, yet he feels qualified to speak as an authority on a complex technical subject.
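
    You can check this in your own logs in a couple of minutes. Here is a rough sketch that tallies hits per crawler User-Agent; the log path and the plain substring matching are just for the example:

    ```python
    def googlebot_hits(log_path):
        """Count hits per Google crawler in a standard access log."""
        counts = {"Googlebot-Image": 0, "Googlebot": 0, "other": 0}
        with open(log_path) as log:
            for line in log:
                if "Googlebot-Image" in line:    # the image bot's own UA string
                    counts["Googlebot-Image"] += 1
                elif "Googlebot" in line:        # the regular crawler
                    counts["Googlebot"] += 1
                else:
                    counts["other"] += 1
        return counts

    # print(googlebot_hits("/var/log/apache2/access.log"))
    ```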
     
  7. bz

    bz Hammerzeit Staff Member Premium Member

    Joined:
    Jun 10, 2010
    Messages:
    514
    Likes Received:
    3,083
    Occupation:
    Fixing everyone elses problems.
    Home Page:
    Why didn't you add this to his thread, as an inline one-stop shop of discussion on the same topic?
     
    • Thanks x 1
  8. tsmihai

    tsmihai Newbie

    Joined:
    Nov 1, 2010
    Messages:
    31
    Likes Received:
    13
    Home Page:
    Very interesting post! I think today good content and links rule!
     
  9. virtualc08

    virtualc08 Supreme Member

    Joined:
    Mar 23, 2010
    Messages:
    1,379
    Likes Received:
    951
    The best thing about both threads is that they sparked an awesome argument where you get to learn a lot of things and come to your own conclusions. Bring it on guys :D
     
  10. wokaka

    wokaka Senior Member

    Joined:
    Apr 1, 2010
    Messages:
    866
    Likes Received:
    230
    yeah, why make 2 threads? Why didn't the OP just reply there?
     
  11. madoctopus

    madoctopus Supreme Member

    Joined:
    Apr 4, 2010
    Messages:
    1,249
    Likes Received:
    3,498
    Occupation:
    Full time IM
    @M.A.D. Thanks for your input mate, but I have a different opinion on a few things:

    Most big sites (1mil+ pages) are SEO black holes. They do link out, obviously, but considering the huge number of pages they have, they link out very little. Look at YouTube, Amazon, etc. They are architected so most internal PR stays internal.

    You can get a site penalized. I see a lot of people say you can't, but in fact you can. It's just not so obvious how to do it. I got my own site penalized by building shit links too fast, then I researched the topic and tested it on a few other fresh sites of mine, 4 out of 5 of which I managed to get penalized. Then I tried it on 3 sites that are not mine: 2 I got penalized, and one I increased its rankings from about #50 to #15-20. They did not recover by themselves within 5 months, after which I stopped checking their stats.

    I will only say it is not just one factor (e.g. link velocity) that gets you penalized but several (at least) and they have to be combined in a certain way and within a certain period of time. Some of the factors are:

    • anchor text variation
    • link velocity
    • link source (type of site/page)
    • link location (text, footer, header)
    • patterns of source sites
    Combining 4 of these factors in a certain way WILL get an average or weak site penalized if the links you build outweigh the links it had.
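
    I won't pretend to know Google's formula, but purely as a sketch of the "several factors combined" idea, with every signal, weight and threshold made up for illustration:

    ```python
    def penalty_risk(profile):
        """profile: observed stats for a batch of new links built to a site."""
        signals = {
            "low_anchor_variation": profile["distinct_anchors"] / profile["new_links"] < 0.1,
            "high_velocity": profile["new_links"] / profile["days"] > 500,
            "spammy_sources": profile["comment_or_forum_share"] > 0.9,
            "footer_links": profile["footer_share"] > 0.5,
            "batch_overweight": profile["new_links"] > 3 * profile["existing_links"],
        }
        fired = sum(signals.values())
        # One signal alone proves nothing; several firing together within the
        # same window is the "9 out of 10 signals scream blackhat" situation.
        return fired >= 4, signals

    blast = {"new_links": 5000, "days": 7, "distinct_anchors": 12,
             "comment_or_forum_share": 0.95, "footer_share": 0.1,
             "existing_links": 40}
    print(penalty_risk(blast))   # risky: 4 of the 5 signals fire
    ```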
     
  12. madoctopus

    madoctopus Supreme Member

    Joined:
    Apr 4, 2010
    Messages:
    1,249
    Likes Received:
    3,498
    Occupation:
    Full time IM
    Because very few people read all the way to page 3 on a thread.
     
  13. madoctopus

    madoctopus Supreme Member

    Joined:
    Apr 4, 2010
    Messages:
    1,249
    Likes Received:
    3,498
    Occupation:
    Full time IM
    Why do you link to those sites? Is the link useful to the visitor? Is your site a spammy minisite? If you do it to avoid footprints in a link network then it has to be d0f0llow. If you link out because you want to give the visitor a recommendation for a good resource, it's your choice whether you make it DF or NF. If you link to a clean, useful site you can make it DF. Personally I don't think it helps you or hurts you either way as long as that site is clean.
     
  14. wokaka

    wokaka Senior Member

    Joined:
    Apr 1, 2010
    Messages:
    866
    Likes Received:
    230
    hi mad, what's your opinion on the Google +1 button?
     
  15. NapsteR

    NapsteR Jr. VIP Jr. VIP

    Joined:
    Mar 2, 2011
    Messages:
    2,779
    Likes Received:
    2,374
    Occupation:
    Full Time IMer
    Location:
    http://www.seophd.com
  16. mazgalici

    mazgalici Supreme Member

    Joined:
    Jan 2, 2009
    Messages:
    1,489
    Likes Received:
    881
    Home Page:
    I work with web spiders and I agree with your description of Google's spiders.
     
  17. madoctopus

    madoctopus Supreme Member

    Joined:
    Apr 4, 2010
    Messages:
    1,249
    Likes Received:
    3,498
    Occupation:
    Full time IM
    My opinion is very similar to what M.A.D. said here:

    http://www.blackhatworld.com/blackh...237-google-begging-exploited.html#post2646289

    Other than that, links are going to be the most important factor for a long time, so whatever new factors they come up with will pretty much be small add-on factors. Kinda like page load time is now, maybe with a greater or smaller weight.
     
  18. imserious

    imserious Senior Member

    Joined:
    Mar 27, 2009
    Messages:
    946
    Likes Received:
    560
    Hey madoctopus,

    Thanks for responding to EW's thread. Well, it does add to the discussion, but do you have empirical evidence that what you are saying is correct?

    What you have written seems like a summary of the research papers available online.

    Your recommendation for ranking is pure whitehat and what everybody already knows. The real information is how to use BH tactics to emulate what G wants.

    I do not have any technical knowledge to add to the discussion, but what EW said in his thread seems empirically truer and more beneficial than the generalised/summarised version of the research papers you have put here.


    • Everyone knows content is king in the long term.
    • Who does not want authority links?
    • Guest posts cost both time and money.

    So, whatever you have said is already there in the download section in innumerable books on SEO. Nothing new, sadly.
     
  19. madoctopus

    madoctopus Supreme Member

    Joined:
    Apr 4, 2010
    Messages:
    1,249
    Likes Received:
    3,498
    Occupation:
    Full time IM
    @imserious: it wasn't supposed to be a "full insight" post, mate. I only wanted to point out that a lot of the stuff in EW's post was incorrect. What I have written is not a summary of research papers. My recommendations are WH, the thread is in the WH section, and they are not supposed to be novel recommendations or some new super advanced stuff. If I explained the super advanced stuff that I personally do, nobody here would understand a thing. The truth is, for a long term online business based on SEO, the best things are the usual WH things; you just have to find a way to do them efficiently. Yes, BH tricks have their place too, but I wouldn't base my online business on "tricks".
     
  20. wokaka

    wokaka Senior Member

    Joined:
    Apr 1, 2010
    Messages:
    866
    Likes Received:
    230
    well in that case maybe you can tell us how to create long term content and promote it so we can get authority links...

    to be honest I'm tired of everyone just saying "good content". Back in 2010 I made a very serious investment in a white hat site. I had good content, and yeah, I got a lot of links for that site too (it went PR6 quickly). However, I spent a lot of money just to get my "good content" picked up by the authority sites and to pay link-bait specialists to promote it. Without a budget I highly doubt anyone can get his "good content" recognized. Not anymore. Problem is, it might not be worth the money, because your content might only rank for keywords where most people won't even bother to click the ads or buy the products.

    You can be an authority site, yeah, but you definitely need a way higher budget than for black hat tricks.

    of course I can add a quality post at times, but getting it "recognized" by the authority sites has never been easy. Won't lie to you, I don't even know how to do that without a decent budget.

    and I agree, guest posts are usually expensive. Most high quality sites won't give you a guest post with your link in it for free. I remember around a year ago someone offered to promote my content on TechCrunch and another very popular blog. He said I would need to pay TechCrunch around $2,000 or something for just one guest post. You can aim for smaller sites, but they would still need you to pay hundreds of dollars each (and you still have to write a quality post for it as well).

    for example, let's say I want to make a travel site. I would need to create a video about me traveling to Sweden, for instance. That could go viral, as opposed to just making a rehashed post about "what to see in Sweden", you know... However, submitting it to YouTube and the like won't really bring me visitors unless I do something pretty crazy in the video or write something very different in my post. Otherwise people would see it as something too common, and nobody would give me authority links for it.

    I still need to pay for YouTube views, guest posts, etc. before it can go viral. So to be honest, just making good long term content won't really help unless you have a decent budget. I agree with you about having "game changer" content. But getting your game changer recognized also takes budget at times (depends on the niche). And without that game changer, you definitely won't be able to do much before "bribing" these high authority sites or paying a lot for famous link-bait services (i.e. a Digg user can bring your link to the top, but for hundreds of dollars, or someone with real followers on Twitter can tweet your service for a certain fee).

    and even for a contest, you need a lot of loyal traffic before you can run one. Otherwise you need yet another budget to promote your contest.

    I would like to know how you can get authority links/high quality guest posts/links from contests and prizes with a small budget. It would be much appreciated.

    oh and also, we are here for business, you know that. I would like to know how to monetize white hat type sites... Having a super high quality website usually 'reduces' your chance of making money. I know we can sell our own services, but most people find it hard to make a product (unless you tell them to make a rehashed ebook/video). AdSense and similar ad services? That one is crap. I hardly make more than $800 a month with AdSense and even way less with Adbrite, while I can easily make 5-10 times more with CPA websites. Talking about CPA, obviously you wouldn't want to decrease your site quality by promoting CPA stuff, I guess. So how do you make money with your white hat sites if not by selling your own product?
     
    • Thanks x 1
    Last edited: Apr 3, 2011