
If Google is so good at detecting duplicated content then why..

Discussion in 'Black Hat SEO' started by shanna_doll, Jan 16, 2013.

  1. shanna_doll

    shanna_doll Power Member

    Joined:
    Apr 10, 2012
    Messages:
    653
    Likes Received:
    321
    Location:
    Bosnia and Herzegovina
    My site is ranked on the 2nd or 3rd page. On the first page, apart from the first 3 results, everything else is pretty much copied/duplicated content, aka auto blogs etc.

    How can I surpass those sites with ease? Or will only time tell? Of course, for some keywords I'm ranking really well, but for most my site is not even close to where it should be.

    On a side note, I hardly build any backlinks. I do have a lot of content. On some sites I have 30 posts, on others I have 10-15 but I'm writing more. My sites are 2 weeks to 3 months old. Haha. I know it's because of authority. But STILL, duplicated content should be away from the first page, whenever something unique on the topic comes in.
     
    Last edited: Jan 16, 2013
  2. futurestunner

    futurestunner BANNED BANNED

    Joined:
    Dec 26, 2009
    Messages:
    1,532
    Likes Received:
    1,036
    I'm tired now. Need some rest.
     
    • Thanks Thanks x 1
  3. steelballs

    steelballs BANNED BANNED

    Joined:
    Dec 5, 2008
    Messages:
    1,832
    Likes Received:
    4,562
    For a start adding some back links will act as an anchor to hold higher ranking positions...

    Good back links and adding quality new content on a regular basis are some basics that I always build on sites...

    But I do that because I believe that these boring WH methods will do for me, as I want sites to keep SE positions...

    There is an old saying: use BH for quick ranking and instability...

    or

    Slog away for longer with WH wait and then when you get ranking you get stability...

    Both work, but it all depends on what your targets are...

    In other words, horses for courses. Just remember, when you jump on the WH and/or BH horse, make sure you jump on the right horse.

    Yes you can ride both the BH and WH horse at the same time but that requires much more skill to get stability :cool:
     
  4. Scritty

    Scritty Elite Member Premium Member

    Joined:
    May 1, 2010
    Messages:
    2,807
    Likes Received:
    4,496
    Occupation:
    Affiliate Marketer
    Location:
    UK
    Google aren't very good at spotting duplicate content in some cases.
    They are better than they were - and getting better all the time.
    But that's still nowhere near as good as they like to pretend they are.

    Several trillion URLs on the net. Every one COULD have content duplicated from any of the others.
    Just sit back and think about the maths there for a minute...
    Check URL one against the other 2 trillion URLs.... Analyze n-grams, links, structure, server speed.
    Check URL two against...
    Oh noes .. while they were doing that 4 million new URLs were added to the internet (it grows by tens of millions of new URLs every day)
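
To put rough numbers on the "check every URL against every other URL" maths, here is a minimal sketch. The 2 trillion figure comes from the post; the throughput of one billion comparisons per second is purely an assumption for scale, not a claim about any real system.

```python
def pairwise_comparisons(n: int) -> int:
    """Number of unordered pairs among n items: n * (n - 1) / 2."""
    return n * (n - 1) // 2


urls = 2 * 10**12  # "2 trillion" URLs, the figure used in the post
pairs = pairwise_comparisons(urls)

# Hypothetical throughput, chosen only to make the scale tangible.
comparisons_per_second = 10**9
seconds_per_year = 60 * 60 * 24 * 365
years = pairs / (comparisons_per_second * seconds_per_year)

print(f"{pairs:.3e} pairs, roughly {years:.1e} years at 1e9 comparisons/s")
```

The pair count grows with the square of the number of URLs, which is why a naive each-against-each check does not scale.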

    How much time and processing power do you think they are going to spend on examining the exact provenance of some article about "patio heaters" with an Amazon ad stuck under it?
    Not very long and not very much is the answer.

    While they bullshit about deep analysis - in fact they are looking to shortcut everything. The best result for the smallest effort.
    They aren't there to "police" the internet. They are a commercial site themselves - just another site on the internet no more no less.
    They aren't even there to "create a better experience for users" (despite their whining publicity line that they are).
    They are there to make money for shareholders.
    As soon as the effort taken to improve the order of their listings costs more than it makes... they will not do it.
    That's business!


    [Why do people think Google is some Big Brother site with special "internet powers" ... or some altruistic website management team - they are neither...Google is just a website that lists other websites in an order that makes them money with advertising placements...that's the beginning middle and end of what the Google search engine is]

    So yeah - they are getting better. But not by "deep analysis" and "human scrutiny". That would be so far from cost effective it's untrue.
    In some very commercial niches there is human scrutiny I'm sure of that, but for most it's a quick whizz through with a crawler, some basic metrics measured, boxes ticked or crossed through, and a few numbers generated which sum up the result. All over the internet shit works, spam works, copied content works.
    Not as much as it used to, and it's a risk to be sure. Who knows what cute "P" animal update 2013 will bring... but they are so far from perfect - and never really likely to get that close.

    Scritty
     
    • Thanks Thanks x 3
  5. Rasacz

    Rasacz Power Member

    Joined:
    Oct 27, 2009
    Messages:
    510
    Likes Received:
    193
    Location:
    Poland
    Another great post from Scritty.
    He's completely right - Google isn't as good as it pretends to be. It must make money, and if they are improving, that's only because of money.
    There are A LOT of sites taking the top SERP spots with only duplicate content - you can't disagree with that. IMO you just need some unique content mixed in and some quality backlinks - and it will work. Or - as you did - just duplicate content, good keywords and some luck.
     
  6. madoctopus

    madoctopus Supreme Member

    Joined:
    Apr 4, 2010
    Messages:
    1,249
    Likes Received:
    3,498
    Occupation:
    Full time IM
    The problem most content spammers have is not that Google is great, but that they suck THAT bad at content generation that they don't even know about algorithms developed 30 years ago. Then, they don't understand data and statistical analysis, and how that can be used easily and cheaply to detect if a page is possible spam without even doing any sort of actual content analysis. For example, there are research papers explaining in detail the statistical footprints of spam content, and they never talk about n-grams or advanced stuff, just simple metrics like text-to-tag ratio, paragraph length, word count, etc. And they have an 80%+ detection ratio for a lot of the spam that used to be generated and still is today.
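
As a rough illustration of how cheap such metrics are to compute, here is a minimal sketch using only Python's standard library. The names `PageStats` and `simple_metrics` are made up for this example, "text-to-tag ratio" is taken here to mean words per opening tag (the papers mentioned may define it differently), and no spam threshold is implied.

```python
from html.parser import HTMLParser


class PageStats(HTMLParser):
    """Collects the opening-tag count, all text, and the text of each <p>."""

    def __init__(self):
        super().__init__()
        self.tag_count = 0
        self.text_parts = []
        self.paragraphs = []
        self._in_p = False

    def handle_starttag(self, tag, attrs):
        self.tag_count += 1
        if tag == "p":
            self._in_p = True
            self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self._in_p = False

    def handle_data(self, data):
        # Note: this also picks up <script>/<style> text; a real tool would skip it.
        self.text_parts.append(data)
        if self._in_p:
            self.paragraphs[-1] += data


def simple_metrics(html: str) -> dict:
    parser = PageStats()
    parser.feed(html)
    parser.close()
    words = " ".join(parser.text_parts).split()
    para_lengths = [len(p.split()) for p in parser.paragraphs]
    return {
        "word_count": len(words),
        "text_to_tag_ratio": len(words) / max(parser.tag_count, 1),
        "avg_paragraph_words": (
            sum(para_lengths) / len(para_lengths) if para_lengths else 0.0
        ),
    }


page = "<html><body><p>one two three</p><p>four five</p></body></html>"
print(simple_metrics(page))
```

Each metric is a single pass over the page, so it costs next to nothing compared with comparing the page against other pages.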

    If I need to compute similarity between 2 articles I do it that way - I build a program to do n-gram based comparison, take a similar approach to Copyscape and obtain a metric. If I need to measure the uniqueness I get from a spintax for 1000 articles, I have to do roughly 500k comparisons each-with-each (1000 × 999 / 2 = 499,500). Takes a lot of dedicated CPU time.
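
A minimal sketch of that kind of each-with-each n-gram comparison, assuming word 3-grams ("shingles") and Jaccard similarity as the metric. Both are assumptions for illustration; the post does not say which n or which metric is used, and this is not how Copyscape works internally.

```python
from itertools import combinations


def shingles(text: str, n: int = 3) -> set:
    """Set of word n-grams in the text, lowercased."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a: set, b: set) -> float:
    """Shared n-grams divided by total distinct n-grams (0.0 if either is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def all_pairs_similarity(articles, n: int = 3) -> dict:
    """Compare every article with every other one: len(articles) choose 2 pairs."""
    sets = [shingles(text, n) for text in articles]
    return {
        (i, j): jaccard(sets[i], sets[j])
        for i, j in combinations(range(len(sets)), 2)
    }


articles = [
    "the quick brown fox jumps",
    "the quick brown fox sleeps",
    "patio heaters keep your garden warm",
]
for pair, sim in all_pairs_similarity(articles).items():
    print(pair, round(sim, 2))
```

With 1000 articles the dictionary would hold 499,500 entries, which is where the CPU time goes.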

    Google doesn't have to do it like that. They use a reverse (inverted) index. That means they only have to check whether the database term mapping for a page overlaps with that of any other page in the system beyond a certain threshold. If it does, the page is not really unique. They have to restrain it though, because most content on the net is in fact NOT unique and there's not an evil or blackhat reason for that. Point is, they can figure out similarity/uniqueness of a page very fast and very cost-effectively using a certain particular approach.
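
The general idea can be sketched as follows. This is only an illustration of an inverted index over word n-grams, not a description of Google's actual system; the class name `ShingleIndex`, the 3-gram size and the 0.5 threshold are all assumptions.

```python
from collections import Counter, defaultdict


def shingles(text: str, n: int = 3) -> set:
    """Set of word n-grams in the text, lowercased."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


class ShingleIndex:
    """Maps each n-gram to the pages containing it (an inverted index)."""

    def __init__(self, n: int = 3):
        self.n = n
        self.postings = defaultdict(set)  # n-gram -> ids of pages containing it
        self.sizes = {}                   # page id -> number of distinct n-grams

    def add(self, page_id, text: str) -> None:
        grams = shingles(text, self.n)
        self.sizes[page_id] = len(grams)
        for gram in grams:
            self.postings[gram].add(page_id)

    def near_duplicates(self, text: str, threshold: float = 0.5) -> list:
        """Pages whose Jaccard similarity with `text` meets the threshold."""
        grams = shingles(text, self.n)
        shared = Counter()
        # Only pages sharing at least one n-gram are ever touched.
        for gram in grams:
            for page_id in self.postings.get(gram, ()):
                shared[page_id] += 1
        hits = []
        for page_id, overlap in shared.items():
            union = len(grams) + self.sizes[page_id] - overlap
            similarity = overlap / union
            if similarity >= threshold:
                hits.append((page_id, similarity))
        return sorted(hits, key=lambda hit: -hit[1])


index = ShingleIndex()
index.add("a", "the quick brown fox jumps")
index.add("b", "completely different text about patio heaters")
print(index.near_duplicates("the quick brown fox sleeps"))
```

The lookup cost depends on how many pages share n-grams with the new page, not on the total number of pages indexed, which is the difference from the each-with-each approach.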

    Now, once you understand this approach and its architecture, you understand how you can beat it. It's not even complicated - it's just a matter of making another algorithm that makes content not fit G's architecture. That way completely garbage content passes as unique enough, and since Google does not know, nor will it know anytime soon, what content means, you're gold.
    Yes, they have marketing campaigns and brainwashing campaigns to keep SEOs worried and in fear, and others to make people think they're the best. They may be, but they are the ones at a disadvantage, not the SEOs. They are the ones trying to be smart and protect the SERPs; the BH SEOs can just dump shit, see what sticks and figure stuff out. Plus, as you said, bothering with unimportant stuff costs them money.

    They are big brother for reasons other than SEO, some good reasons and some not so "do no evil" reasons. But that's more of a by-product of owning the data. They do have people and algos working on finding kiddie porn, gore, etc., both on G and YouTube. They did and do work hard on keeping virus/scam spreading in check to a certain extent, etc. But as I said, this has nothing to do with SEO.

    My point is even adapted Markov chains work if you know how to adapt them. Even 100% scraped content with no spinning works. Lots of stuff works, but people just do it waaay too lazy and dumb when they could put just a bit more thought into it and do better. Not to mention many overdo it, becoming a blip on G's radar very fast. If you're a ninja infiltrating the emperor's castle dressed in black and carrying weapons, then stay in the fking shadows. Otherwise, if you want to walk through the front gate then FFS dress as a peasant and carry a peasant tool that you know how to use as a weapon later. That reminds me of this (all is cool to watch, but skip to 49:18 for the ninja part):

     
    • Thanks Thanks x 7
    Last edited by a moderator: May 18, 2016
  7. Scritty

    Scritty Elite Member Premium Member

    Joined:
    May 1, 2010
    Messages:
    2,807
    Likes Received:
    4,496
    Occupation:
    Affiliate Marketer
    Location:
    UK
    When Madoctopus talks content - we listen :)
     
    • Thanks Thanks x 2
  8. madoctopus

    madoctopus Supreme Member

    Joined:
    Apr 4, 2010
    Messages:
    1,249
    Likes Received:
    3,498
    Occupation:
    Full time IM
    Ah, to answer the OP's post, because I said nothing related to his question...

    You can get traffic and make money with no links and poor content. It's all about choosing the keywords. Super long tail keywords are often great money makers. Maybe you only get like 5 visits/month for such a keyword, but if they make you money, are not targeted by competitors, and you manage to target tens of thousands of them, you make money. I had a site I built as a test, and never thought would do too well, make me $100/day from AdSense with ZERO LINKS. Just because I targeted the right keywords, in high volume, and they happened to be high CPC.
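
The long-tail arithmetic can be sketched like this. Only the 5 visits/month figure comes from the post; the keyword count, ad click-through rate and earnings per click are hypothetical values picked to show how small numbers multiply up, not data from the site described.

```python
def monthly_revenue(keywords: int, visits_per_keyword: float,
                    ad_ctr: float, cpc_usd: float) -> float:
    """Monthly ad revenue: total visits * share who click an ad * earnings per click."""
    return keywords * visits_per_keyword * ad_ctr * cpc_usd


keywords = 20_000        # hypothetical: "tens of thousands" of long-tail keywords
visits_per_keyword = 5   # visits per month per keyword, the figure from the post
ad_ctr = 0.03            # hypothetical: 3% of visitors click an ad
cpc_usd = 1.00           # hypothetical: $1 earned per click ("high CPC")

revenue = monthly_revenue(keywords, visits_per_keyword, ad_ctr, cpc_usd)
print(f"${revenue:,.0f}/month, about ${revenue / 30:,.0f}/day")
```

Under these assumed values, 100,000 monthly visits yield 3,000 clicks, which is in the region of the $100/day mentioned above.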

    Thing is, Google doesn't know much. They don't know X is a blackhat and Y is a whitehat, or that X has spam content and Y has nice content. They just have some numbers. They take these numbers, use an equation and come up with a spot where X and Y should rank. That simple. If the blackhat dudes you compete against have more factors to their advantage, or they look better, they outrank you easily even if you have a more useful site. It's that simple.
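
A toy version of "take these numbers, use an equation" might look like the sketch below. The signal names, weights and values are all invented for illustration; the real ranking factors and their weighting are not public.

```python
def score(signals: dict, weights: dict) -> float:
    """Weighted sum of numeric signals; a missing signal counts as 0."""
    return sum(weight * signals.get(name, 0.0) for name, weight in weights.items())


# Hypothetical weights: links dominate, as the post argues.
weights = {"links": 0.6, "content_uniqueness": 0.25, "site_age": 0.15}

# Hypothetical signals, each normalised to the range 0..1.
sites = {
    "X_blackhat": {"links": 0.9, "content_uniqueness": 0.2, "site_age": 0.7},
    "Y_whitehat": {"links": 0.1, "content_uniqueness": 0.9, "site_age": 0.1},
}

ranking = sorted(sites, key=lambda name: score(sites[name], weights), reverse=True)
for position, name in enumerate(ranking, start=1):
    print(position, name, round(score(sites[name], weights), 3))
```

Nothing in the equation knows which site is "good"; X ranks first simply because its numbers add up higher under these weights.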

    "But STILL, duplicated content should be away from the first page, whenever something unique on the topic comes in."

    NOPE! Because as Scritty and I said in previous posts, Google doesn't really know that. It just guesses based on numbers it analyzes.

    Bottom line is, don't try to beat competition in a BH niche with great content that costs a lot. Just make some content, then focus on links. Links are always the ingredient that will offset the equation to your advantage. Give me the shittiest site out there with unreadable content and give me the best links for it and I will outrank Wikipedia.
     
    • Thanks Thanks x 3