Plugin for preventing duplicate posts in WordPress

autoblogged

Many of you who have worked with some kind of WordPress autoblogging tool have run into the issue of duplicate posts at one time or another.

It sounds like it would be an easy problem to fix, but it is surprisingly difficult to prevent, and we have been fighting this issue for years. And while we have for the most part had success fixing it, WordPress always seems to change something that brings the problem back.

Part of the problem is that WordPress can be doing several things at once and sometimes there are delays while WordPress handles things like trackbacks and pings. You also have concurrency issues on busy sites.

We did, however, come up with a pretty good solution recently by addressing duplicate posts outside of AutoBlogged and later in the posting process.

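We created a simple plugin that hooks into the wp_insert_post_data filter, which is the last thing WordPress calls before inserting the post into the database. By that point WordPress has already made the post name (the slug) unique, so a post whose slug collides with an existing one arrives with a "-2" suffix. We simply check whether the post name ends in "-2", and if the post title itself doesn't also end with a "2", we kill the post.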
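
A minimal sketch of that check, not the actual plugin code. The function names `xdc_is_probable_duplicate` and `xdc_filter_post_data` are made up for illustration, and the sketch assumes the filter receives the post fields as an array with `post_name` and `post_title` keys. It returns false for a suspected duplicate so that the insert fails.

```php
<?php
/**
 * Returns true when the slug ends in "-2" but the title does not end in "2",
 * i.e. WordPress most likely added the suffix because the slug already existed.
 */
function xdc_is_probable_duplicate( $post_name, $post_title ) {
    $slug_has_suffix = ( substr( $post_name, -2 ) === '-2' );
    $title_ends_in_2 = ( substr( rtrim( $post_title ), -1 ) === '2' );
    return $slug_has_suffix && ! $title_ends_in_2;
}

/**
 * Filter callback (hypothetical name): hand the data back untouched,
 * or return false to block the insert.
 */
function xdc_filter_post_data( $data ) {
    if ( xdc_is_probable_duplicate( $data['post_name'], $data['post_title'] ) ) {
        return false;
    }
    return $data;
}

// Only register the hook when running inside WordPress.
if ( function_exists( 'add_filter' ) ) {
    add_filter( 'wp_insert_post_data', 'xdc_filter_post_data', 99 );
}
```

Because the check is just two string comparisons on data WordPress has already prepared, it needs no database query of its own.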

Install and activate it like you would any WordPress plugin. There is no configuration necessary; it just starts working.

It works surprisingly well and should work with any autoblogging plugin out there. The only drawback is that you can't ever have any two posts, even manual posts, with the exact same title :)

Anyway, we thought we would share it with BHW.
 

I use "Duplicate Posts Eraser", a pretty good plugin, especially since Youtube Posts sometimes goes crazy and makes a few hundred duplicates. This one works behind your back and does all the dirty work for you. You can find it on WP's site.
 
Yes, we have tried that and similar approaches as well. One problem is that database caching can prevent it from working properly.

Also, if you look at what that plugin does, it checks EVERY post title in your database to make sure it doesn't match any other post title. It does this each time you add a new post, so if you are adding 20 (or more) posts at once you are putting a huge load on the database. When you do that on a busy site or a site with lots of posts, it will bring your blog to a standstill. And it still doesn't work perfectly.

We even wrote similar code into AutoBlogged and eventually abandoned it because it still wasn't catching all duplicate posts on some servers.

The plugin we wrote adds practically zero load on the site because it simply compares two strings and it doesn't perform any database queries.
 
Thanks for the share and for considering server load. Was always a big problem on my AB's, and now that I'm back into them a bit, maybe this will help.
 
I must say that overall I am really impressed with this blog. It is easy to see that you are passionate about your writing.
 
Wish I had found this thread earlier. This is exactly what I have been looking for. I have tried every duplicate post plugin with no luck on my article directory. Unfortunately, I recently tried to tidy up the duplicates in the database using MySQL and managed to empty the wp_posts table, and although I had backed it up, I have been unable to restore it since.

If anyone can help with this it would be greatly appreciated!
 
Has anyone tried Delete Duplicate Posts? I tried it a while back, but it conflicted with my WP Robot, so I deactivated it. By the way, does this plugin (xtra-dupecheck.zip) work with WP Robot? Any suggestions, please, for a duplicate-post plugin that works with WP Robot?

Thanks to the OP for sharing this plugin :)
 
I have been trying the plugin on my article directory website with no luck. In fact I think I have tried every plugin out there to sort out my dupe posts.

I even tried modifying the database but that turned out a bit of a disaster too!

URL:
Code:
www.original-articles.com
 
Thanks. I've been using 'Delete Duplicate post' plugin and just switched to yours.

Quick question: the PHP file is fairly straightforward and consists of a single function. Can this be moved over to functions.php in order to reduce the number of plugins?
 
does not work with 3.1.3

Warning: array_keys() [function.array-keys]: The first argument should be an array in /home/hellasne/public_html/wp-includes/wp-db.php on line 1203

Warning: Invalid argument supplied for foreach() in /home/hellasne/public_html/wp-includes/wp-db.php on line 1205

Warning: implode() [function.implode]: Invalid arguments passed in /home/hellasne/public_html/wp-includes/wp-db.php on line 1214
 
Best One Indeed! Great plugin and special thanks. Trying them now ..
 
Are you sure this is caused by this plugin? We have this built into AutoBlogged now and haven't seen any problems.
 
Same problem here.

Anybody have a solution for this?
 
Jeez :) I've been fighting with this damn problem since ... huh ... I can't even remember when, lol. To be honest I haven't tested your plugin yet, but I have one short and very ugly question: what happens if the blog already has duplicate content :D? That is, posts with the same name, just followed as usual by "-2", "-3" and so on. :) Can you make it delete those posts? Duplicate posts come with exactly the same name, so a way to run a cleanup on our existing posts would be simply gorgeous :)
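
The cleanup asked about here could start by finding slugs that look like "-2", "-3" copies of a slug that also exists without the suffix. A rough sketch, assuming the list of slugs has already been read from the database (the function name and sample slugs are hypothetical). It only reports candidates and deletes nothing, since a post such as "part-2" can legitimately end in a number:

```php
<?php
/**
 * Given a list of post slugs, return those that end in "-N" (N >= 2)
 * and whose base slug, without the suffix, is also in the list.
 */
function find_suffixed_duplicates( array $slugs ) {
    $existing   = array_flip( $slugs ); // slug => index, for fast lookups
    $candidates = array();

    foreach ( $slugs as $slug ) {
        // Capture the base slug and the numeric suffix, e.g. "my-post" and "3".
        if ( preg_match( '/^(.+)-(\d+)$/', $slug, $m ) && (int) $m[2] >= 2 ) {
            if ( isset( $existing[ $m[1] ] ) ) {
                $candidates[] = $slug;
            }
        }
    }
    return $candidates;
}

$slugs = array( 'my-post', 'my-post-2', 'my-post-3', 'part-2', 'other-post' );
print_r( find_suffixed_duplicates( $slugs ) ); // lists my-post-2 and my-post-3
```

The candidates would still need a manual look (or a title comparison) before anything is deleted.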
 
Thanks! Using this plugin with WP 3.3.2 and it works almost perfectly. There's only one thing I'm trying to fix:

I'm using FeedWordPress and when FWP tries to add a dupe, it is indeed stopped by Xtra Dupecheck. The only problem is that FWP then gives the following errors, for each blocked duplicate post:

Code:
Warning:  Invalid argument supplied for foreach() in /home/siteroot/public_html/site/wp-includes/wp-db.php on line 1206

Warning:  implode() [function.implode]: Invalid arguments passed in /home/siteroot/public_html/site/wp-includes/wp-db.php on line 1215

Warning:  array_keys() [function.array-keys]: The first argument should be an array in /home/siteroot/public_html/site/wp-includes/wp-db.php on line 1204

I know these errors are caused by "function _insert_replace_helper" in wp-db.php, because this function is not getting the expected "$data" values when a duplicate post is being blocked: Xtra Dupecheck returns "false" when it finds a dupe (instead of returning $data). I tried using remove_filter to take "_insert_replace_helper" out, so that it doesn't run at all when a duplicate post is blocked, yet it keeps running and causing the errors.

I'm aware of the fact that not supplying "_insert_replace_helper" with $data, is essentially what prevents dupes from being inserted into the WP database.

I noticed someone else is also getting the same errors. Hopefully the above details will aid in fixing this issue.

In short: does anyone know how to stop "function _insert_replace_helper" completely when a duplicate post is being blocked? That would get rid of the errors.
 
Use cron (http://en.wikipedia.org/wiki/Cron) and there will be no such problem at all.
 
Or you can use the canonical option from All in One SEO Pack, or code it manually.

<?php if ( is_single() ) { ?>
<link rel="canonical" href="<?php the_permalink(); ?>" />
<?php } ?>

Put this inside the head tags of /theme/header.php.

So even if somebody generates another URL for a specific page, the header will tell search engines which one is the original (since it gets the info from the DB), so there is no duplicate content problem from an SEO point of view.
 