
Is there a unique content "UNIT" for Google?

Discussion in 'Cloaking and Content Generators' started by biks, Dec 17, 2008.

  1. biks

    biks Power Member

    Joined:
    Oct 28, 2008
    Messages:
    713
    Likes Received:
    376
    Occupation:
    Video editor, graphics guy
    What's the smallest amount of text I can steal/spin that would be considered unique content? And does this bit of text need to be unique to the article or to the internet in general? Follow me on this:

    I grab an article from Ezine*Articles.com and slap it onto a website. 100% NOT unique.

    I grab two articles about the same topic. (There's a TON of articles that are basically about the same thing on there.) I use the first half of article 1 and the last half of article 2 and make a new article. Would I now have something that is 50% unique to itself, or 100% NOT unique to the internet?

    Let's go nuts - I grab 1 sentence from each of 16 articles and make an article from that. (Instant Article Wizard does just this.) Google would've seen each sentence on the Internet, but not together in this order. This article is totally unique unto itself, but the parts aren't unique to the internet. What does Google think of such a creation?

    I go even further and change the middle word of each of the 16 sentences above. Each sentence is now unique, but the word groups before and after the changed word aren't. Is this now totally unique, or does Google see that it's made up of 4-5 word groups that aren't unique to the internet?
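To make the question concrete, here is a minimal Python sketch that swaps the middle word of one sentence and lists which 4-word groups survive unchanged. The sentence and the helper name `word_groups` are made up for illustration, and nothing here claims to be what Google does; it only shows that the word groups on either side of the changed word stay identical.

```python
def word_groups(text, n=4):
    """Return the set of all runs of n consecutive words in text."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


# Hypothetical example sentence; only the middle word differs.
original = "the quick brown fox jumps over the lazy dog"
spun = "the quick brown fox leaps over the lazy dog"

# 4-word groups that appear in both versions.
shared = word_groups(original) & word_groups(spun)
print(sorted(shared))  # ['over the lazy dog', 'the quick brown fox']
```

Every 4-word group that touches the changed word becomes new, but the groups at the start and end of the sentence are still exact matches for the original.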

    At some point you can't break down the English language any further because certain short groups of words normally go together. How can you be penalized for that?
     
  2. Atomic21

    Atomic21 Junior Member

    Joined:
    May 8, 2008
    Messages:
    182
    Likes Received:
    35
    Location:
    USA
    Use c0pyscape.c0m. If it passes c0pyscape then it should be fine.
     
  3. biks

    biks Power Member

    Right, but I want to know WHY I'm going to be fine. It's very cool that you can jiggle the handle and candy falls out of the Google candy machine, but I want to know what is causing the candy to fall out in the first place.
     
  4. drkenneth

    drkenneth Executive VIP

    Joined:
    Nov 13, 2008
    Messages:
    285
    Likes Received:
    176
    Occupation:
    Developer/Entrepreneur
    Location:
    USA
    It's hard to say for sure--Google does not exactly make that information public :p

    It probably depends on LOTS of factors. More traffic/backlinks = more scrutiny. Time of existence, current ranking, etc., could also have an effect. (If your page is ranked 1000th, they won't look as carefully or spend as many resources analyzing your content as they would if it were ranked 25th.)

    There is no definite figure for how different your pages need to be. Basically, go for what seems reasonable, and if you start having issues, make the content even more unique or split it up further.
     
  5. sidddd

    sidddd Power Member

    Joined:
    May 15, 2008
    Messages:
    749
    Likes Received:
    460
    I just glance through the top 4-5 articles and then write my own short article in plain English. :) That makes it 100% unique!!
     
  6. blackhaze

    blackhaze Power Member

    Joined:
    Jan 11, 2008
    Messages:
    661
    Likes Received:
    167
    Occupation:
    self made millionaire
    Location:
    in the matrix
    No, and NO, and NO.

    CS has a very primitive algorithm which has NOTHING to do with whether your content is seen as dupe in Google or not.

    Copyscape can give you an OK if a simple JavaScript obfuscates your text... however, Google is NOT that stupid.
     
  7. xbox360gurl70s

    xbox360gurl70s Elite Member

    Joined:
    Sep 28, 2008
    Messages:
    1,532
    Likes Received:
    349
    Location:
    In your wet dreams
    4 words in the exact same order as text elsewhere already counts as dupe content.

    I don't put too much faith in this, though.
     
  8. drkenneth

    drkenneth Executive VIP

    Could actually be possible. For example, if Google determined your site a 'site of interest' i.e. growing rapidly in ranking or some other flag, it could do the following:
    1) Load a page
    2) Cut the page up into 4-word pieces
    3) Search for each of these 4-word pieces in its own database and see if there are any results
    4) Count the number of instances where the 4-word pieces exist in other locations
    5) If a certain number of overlaps are found (like 70 out of 100 4-word pieces occurring somewhere else) then blacklist your page


    This could very well be similar to what they do. (Of course it would be more complicated and take more into account.) As their processing power increases and algorithms mature, it will be harder and harder to hide.
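The steps above can be sketched in a few lines of Python. This is only an illustration of the guess in this post, not a description of Google's actual process: the names `shingles`, `overlap_ratio` and `looks_duplicated` are hypothetical, the "database" is just an in-memory set, and the 0.7 threshold is the "70 out of 100" figure from step 5.

```python
def shingles(text, n=4):
    """Step 2: cut the page text into overlapping n-word pieces."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def overlap_ratio(page, known_shingles, n=4):
    """Steps 3-4: fraction of the page's pieces already present in the index."""
    pieces = shingles(page, n)
    if not pieces:
        return 0.0  # page shorter than n words: nothing to compare
    hits = sum(1 for piece in pieces if piece in known_shingles)
    return hits / len(pieces)


def looks_duplicated(page, known_shingles, threshold=0.7, n=4):
    """Step 5: flag the page when enough of its pieces occur elsewhere."""
    return overlap_ratio(page, known_shingles, n) >= threshold


# Tiny stand-in for the search engine's database (an assumption for the demo).
index = set(shingles("the quick brown fox jumps over the lazy dog"))

copied = "the quick brown fox jumps over the lazy dog"
fresh = "my cat sleeps on the warm windowsill every afternoon"

print(overlap_ratio(copied, index), looks_duplicated(copied, index))  # 1.0 True
print(overlap_ratio(fresh, index), looks_duplicated(fresh, index))    # 0.0 False
```

A real index would hold pieces from billions of pages, so in practice the lookup in step 3 is the expensive part, not the cutting in step 2.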
     
  9. freller

    freller Regular Member

    Joined:
    Sep 26, 2008
    Messages:
    210
    Likes Received:
    65
    Dr Kenneth is nearest, I think, but at present I don't know of ANY copy checker that will properly find dupes online, and I've used most of the 'commercial grade' educational dupe checkers that are used to find university thesis cheats.

    4-word blocks are commonly regarded as the finest 'granularity' you need to go to: with anything smaller, not only does the number of checks necessary rise dramatically, but so does the number of duplicates found. However, IT DOESN'T MATTER. All that matters is that your whole page isn't an exact copy of something somewhere else, and that you don't repeat the same page elsewhere on your site.
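A quick way to see the second half of that granularity point (shorter blocks produce more matches) is to count shared word groups between two unrelated sentences at different block sizes. This is a toy Python sketch; the two sentences and the helper name `word_groups` are invented for the example.

```python
def word_groups(text, n):
    """Return the set of all runs of n consecutive words in text."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


# Two hypothetical sentences that nobody would call copies of each other.
a = "the cat sat on the mat in the sun"
b = "a dog sat on the rug in the shade"

# Number of word groups the two sentences share, for block sizes 1 to 4.
shared_counts = {n: len(word_groups(a, n) & word_groups(b, n)) for n in range(1, 5)}
print(shared_counts)  # {1: 4, 2: 3, 3: 1, 4: 0}
```

At 1 to 3 words the sentences "match" on ordinary phrases like "sat on the"; at 4 words the false matches disappear, at least in this small example.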

    What all search engines are looking for, and will rank highly, are on-topic, well-constructed quality sites. And that's something many BH'ers with their Markov-spun garbage tend to fall down on. RSS, if it's not just a page of short snippets, is OK, as are directory articles - think about it, how are article directories ranked for the same articles? If you really think about that for a moment the answers will become very clear.

    I've proved it to myself so many times now that it really doesn't need retesting yet again. It's the quality of what's in a site that counts, not whether it's a set of duplicate content from other places. One great site is worth way more than a zillion crap ones.