
SEO - How Strict is Search on Duplicate Content?

Discussion in 'Black Hat SEO' started by The Doctor, Apr 11, 2017.

  1. The Doctor

    The Doctor Jr. VIP Jr. VIP

    Joined:
    Dec 18, 2010
    Messages:
    892
    Likes Received:
    261
    Occupation:
    Computer Scientist, Engineer, Programmer.
    Location:
    ☆☆☆☆☆☆
    Home Page:
    I was writing code for a proof of concept for a new content tool and checking the content I generated against Copyscape. What I found suggests that other people are doing something similar to what my code does. Most of the articles I generated returned Copyscape results for articles matching around 50% of the content.

    This tells me that some of the most popular news sites in the world are most likely using an automated method of generating new articles before altering them a bit. There's no other way I could be getting these results. For instance, this article I just generated shows 7 Copyscape matches, each around 50% likeness. How much similarity is too much for the content to be truly useful? Is it more like a gradient or is there a cutoff point?
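
A percentage like the ones above can be approximated with word-shingle overlap. The following is a minimal Python sketch, not Copyscape's actual method (which is not described in this thread). The function names `shingles` and `containment` and the shingle size `k=3` are assumptions picked for illustration.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word sequences (shingles) in text, lowercased."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def containment(candidate, source, k=3):
    """Fraction of the candidate's shingles that also appear in source.

    Hypothetical scoring: 1.0 means every k-word run in candidate is
    found in source, 0.0 means none are (or candidate is too short).
    """
    a = shingles(candidate, k)
    b = shingles(source, k)
    if not a:
        return 0.0
    return len(a & b) / len(a)


if __name__ == "__main__":
    generated = "the quick brown fox jumps over the lazy dog"
    published = "the quick brown fox sleeps under the lazy dog"
    print(f"{containment(generated, published):.0%} of shingles matched")
```

A score computed this way is a gradient rather than a yes/no answer, so any cutoff applied to it would be a choice made by whoever runs the check.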
     
  2. drogon

    drogon Elite Member Premium Member

    Joined:
    May 28, 2010
    Messages:
    2,239
    Likes Received:
    1,111
    100% unique!
     
  3. The Doctor

    The Doctor Jr. VIP Jr. VIP

    Joined:
    Dec 18, 2010
    Messages:
    892
    Likes Received:
    261
    Occupation:
    Computer Scientist, Engineer, Programmer.
    Location:
    ☆☆☆☆☆☆
    Home Page:
    Obviously that's ideal but that can't be a requirement when I see 7+ major sites with 50+% similarity per article I search.
     
  4. Cshark

    Cshark Jr. VIP Jr. VIP Premium Member

    Joined:
    Feb 25, 2011
    Messages:
    1,362
    Likes Received:
    184
    Gender:
    Male
    Occupation:
    Grinding
    Location:
    NYC
    The answer can be easily traced from your post. News cannot be recreated and quotes cannot be re-made, so that explains why news sites don't and cannot have 100% unique content. And that's okay with search engines so long as it's not 100% plagiarized.
     
  5. drogon

    drogon Elite Member Premium Member

    Joined:
    May 28, 2010
    Messages:
    2,239
    Likes Received:
    1,111
    That's the requirement for almost all niches, with, I guess, a few exceptions as you noted. But the news sites are huge brands and I doubt they rely much on SEO for their visitors. It's like how there is going to be a lot of duplicate content on Facebook added by users, but FB is not exactly relying on SEO! The same could be said of BHW. Now, your average 20-page or 100-page website will get hit hard if it uses duplicate content.
     
  6. darulez

    darulez Jr. VIP Jr. VIP

    Joined:
    Mar 12, 2013
    Messages:
    2,280
    Likes Received:
    717
    Gender:
    Female
    Occupation:
    Waiting 36 days till I can stick it in
    Location:
    Walhalla
    It just won't index, or it gets de-indexed quite fast.

    I once did a test to see if that DC thing is just pure mozgarbleseogurublogbullshit or true.
    Found out it is really there:
    https://gyazo.com/

    > made the content an image.
    > created a 2 paragraphs unique content.
    done.

    This was just a very small case with low competition, but enough for me.

    DC problems occur as cannibalisation.
    The page can either get de-indexed, or some "other content" could rank instead. If I search, I can find some screenshots of a static /sitemap/ ranking somewhere while the content itself does not.
     
    Last edited by a moderator: Apr 19, 2017
  7. SEO

    SEO Jr. VIP Jr. VIP

    Joined:
    Jan 6, 2017
    Messages:
    766
    Likes Received:
    568
    One thing to keep in mind is that Google understands that news is syndicated. The major news sites also use algorithmically generated content for things such as stock market swings, major stocks going up or down or earthquakes throughout the world. Much of that content is template driven, so you'll see a lot of DC matching in sites like Copyscape. This part is just a guess, but if I were Google, I wouldn't trash trusted sites for duplicate content, but I certainly would trash small blogs and other low authority sources. I'd post links to more sources about what I'm talking about, but I don't have enough posts yet. You can google for algorithmically generated content though.
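
To illustrate why template-driven content matches so heavily in duplicate checkers, here is a small Python sketch. The template text, the event fields, and the place names are all made up for the example, and `shared_word_ratio` is a hypothetical helper, not how any news site or checker is known to work.

```python
# Hypothetical template: only the four {fields} change between articles.
TEMPLATE = (
    "A magnitude {mag} earthquake struck {dist} km from {place} on {day}, "
    "according to the seismological agency. "
    "No damage was immediately reported."
)


def render(event):
    """Fill the fixed template with one event's data."""
    return TEMPLATE.format(**event)


def shared_word_ratio(a, b):
    """Fraction of the words in a that also occur somewhere in b."""
    words_a = a.lower().split()
    words_b = set(b.lower().split())
    return sum(w in words_b for w in words_a) / len(words_a)


if __name__ == "__main__":
    # Invented sample data, for illustration only.
    first = render({"mag": "4.8", "dist": "12", "place": "Exampleville", "day": "Monday"})
    second = render({"mag": "5.3", "dist": "40", "place": "Sampletown", "day": "Friday"})
    print(first)
    print(second)
    print(f"shared words: {shared_word_ratio(first, second):.0%}")
```

Two articles produced from the same template differ only in the filled-in fields, so most of their wording overlaps even though the underlying events are different.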
     
  8. terrycody

    terrycody Supreme Member

    Joined:
    Sep 29, 2012
    Messages:
    1,415
    Likes Received:
    385
    Occupation:
    marketer
    Location:
    Hell
    So? What tool are you currently working on? One click to generate unique content, like Article Forge?
     
  9. The Doctor

    The Doctor Jr. VIP Jr. VIP

    Joined:
    Dec 18, 2010
    Messages:
    892
    Likes Received:
    261
    Occupation:
    Computer Scientist, Engineer, Programmer.
    Location:
    ☆☆☆☆☆☆
    Home Page:
    Looking at their thing, it seems to generate content in a different way from what I've developed. Does Article Forge really use deep learning, or is it just compiling a bunch of factoids or something?
     
  10. bartosimpsonio

    bartosimpsonio Jr. VIP Jr. VIP Premium Member

    Joined:
    Mar 21, 2013
    Messages:
    12,065
    Likes Received:
    10,837
    Occupation:
    WHEREZ MA
    Location:
    BITCOINS AT?
    Home Page:
    A LOT of news is syndicated from a single source. Reuters and AP stuff show up on dozens of major news sites, same content.
     
  11. BlueSnow

    BlueSnow Regular Member

    Joined:
    Feb 5, 2014
    Messages:
    239
    Likes Received:
    78
    Gender:
    Male
    I have some questions related to this:

    1. Which is the best online tool for a Copyscape-style check?
    2. Can I safely use duplicate content if I want to use only social network traffic and native ads?
    3. Can I copy AP and Reuters news (or use their RSS feeds) on my site and use it for social network traffic?
    4. Can I translate articles from other languages to English via Google Translate, send them for proofreading, and use them on my site?
     
  12. SEO

    SEO Jr. VIP Jr. VIP

    Joined:
    Jan 6, 2017
    Messages:
    766
    Likes Received:
    568
    My answer to this is that it depends on the authority that you're concerned about. If you're concerned about copyright infringement, then no, it isn't safe to use their content. If you don't care about legal stuff, and you don't care about Google traffic, then I don't see why scraped content couldn't work. You'll just need to worry about your web host getting DMCAs and killing off your account.
     