
Are you using WordPress? Then you might have a problem

Discussion in 'Black Hat SEO' started by aliskorn, Jul 14, 2009.

  1. aliskorn

    aliskorn Jr. VIP Jr. VIP Premium Member

    Joined:
    Apr 21, 2008
    Messages:
    484
    Likes Received:
    459
    Occupation:
    Psychologist & programmer
    It seems this is some serious sh*t. It's about the permalink structure. Most of us use /%postname%/ or, worse, /%category%/%postname%/. This is very bad because WordPress fills the database with more rewrite rules than it should, since it can't tell whether what it is fetching is a page, a post or a category. This stresses the server and the DB a lot when you have hundreds of posts and are running on a shared server. It can only be avoided if your permalinks start with a number, like this: /%year%/%postname%/. This is a bummer because it is not that SEO friendly.
    Anyway, this might be the reason some of you are getting banned from your host. CPU usage! This is the full article:
    Code:
    Over the past several days, there has been an interesting discussion on the wp-testers mailing list (though it really belonged on the wp-hackers list, but that's beside the point) about permalink structures in WordPress. The original question came from matthijs, who asked why WordPress was storing rewrite rules for every page on his site in a database option. Further discussion revealed that this was a side effect of his particular permalink structure, and produced some really good information about good and bad permalink patterns. This information could be important for sites that use non-standard URL structures, and I thought it deserved a summary.
    
    First, let's look at the original question and the situation that brought it about:
    
        Recently I discovered that the current way wordpress handles permalinks is not scalable. All rewrite_rules are at the moment held in a single database field in the wp_options table. If you have a few dozens pages and posts, you have maybe a few hundred rewrite_rules in it and all is well. But as soon as you start to have a few hundred pages and attachments, the amount of rewrite_rules explodes as well as the field size. This also depends on the permalinks settings. On one of my sites I can't even open the database field to take a look because my browser and text editor crash because of its size.
    
    Before anyone starts to panic, let me say that this is not a general problem in WordPress. This person had a particular permalink structure which forced WordPress to store extra rules for every page. This is a situation which can be avoided by choosing a permalink pattern which allows WordPress to find your posts in an efficient way.
    
    WordPress gives site builders a lot of flexibility in how their post URLs are created. There are several attributes which can be used, and ordered how the person likes. The default "pretty permalink" structure looks like this:
    
    /%year%/%monthnum%/%day%/%postname%/
    
    Which results in permalink URLs that look like:
    
    http://example.com/2009/01/22/hello-world/
    
    There are several structure tags which can be used to form permalinks: %year%, %monthnum%, %day%, %hour%, %minute%, %second%, %postname%, %post_id%, %category%, %tag%, and %author%. As mentioned earlier, this gives a lot of flexibility in how your URLs can appear. However, Ryan Boren pointed out:
    
        Verbose rules are used for structures beginning with %category%, %tag%, %postname%, and %author%.  Avoiding such structures is best.
    
    This important note was subsequently added to the Codex page about Using Permalinks:
    
        For performance reasons, it is not a good idea to start your permalink structure with the category, tag, author, or postname fields. The reason is that these are text fields, and using them at the beginning of your permalink structure it takes more time for WordPress to distinguish your Post URLs from Page URLs (which always use the text "page slug" as the URL), and to compensate, WordPress stores a lot of extra information in its database (so much that sites with lots of Pages have experienced difficulties). So, it is best to start your permalink structure with a numeric field, such as the year or post ID.
    
    This would be a problem for any dynamic CMS, not just WordPress. If there isn't some way to narrow down the information in the URL and map it to a specific page or post, the system must perform a lot of database searches to find the correct entry. Otto provides a really good hypothetical example:
    
        Actually, I think this deserves a bit more discussion... Let's consider a permalink like %category%/%postname%.
    
        So you're handed a URL like /mycat/mypost. You start by parsing it into mycat and mypost. You don't know what these are. They're just strings to you. So, first, you have to consider what "mycat" is.
    
        First, you query to see if "mycat" is a pagename. This is a select from wp_posts where post_slug = mycat and post_type = page. No joy there.
    
        Next, you query to see if "mycat" is a category. This is a select from wp_terms join wp_term_taxonomy on (term_id = term_id) where term = mycat and taxonomy = category. Hey, we found a mycat, so that's good. Unfortunately, this just tells us that it's a category, which is rather useless in retrieving the actual post we're looking for. So we ignore the category.
    
        Now, we move on to the "mypost". Again, we start querying:
        1. Is it a page?  select from wp_posts where post_slug = mypost and post_type = page. Nope.
        2. Is it a category?  select from wp_terms join wp_term_taxonomy on (term_id = term_id) where term = mypost and taxonomy = category. Nope.
        3. Is it a post? select from wp_posts where post_slug = mypost and post_type = post. Bingo.
    
        The whole goal is to determine the specific post being asked for. The category is not helpful in this respect, and we have to do a couple queries just to figure out that we need to ignore it. Five queries to determine what the post is with this structure. Five queries, two of them expensive (joins ain't cheap). And these have to happen on every load of a post on your site.
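
The lookup sequence in the quoted example can be sketched in a few lines of Python. This is only an illustration of the hypothetical walk-through above, not WordPress code: the three sets are made-up in-memory stand-ins for the tables, and `resolve` is a name invented here.

```python
# Hypothetical in-memory stand-ins for the tables named in the quote.
PAGES = {"about", "contact"}   # page slugs
CATEGORIES = {"mycat"}         # category slugs
POSTS = {"mypost"}             # post slugs


def resolve(path):
    """Walk the URL segments as in the quoted example, recording each lookup."""
    queries = []

    def lookup(kind, table, slug):
        # Each call stands for one database query.
        queries.append((kind, slug))
        return slug in table

    found = None
    for segment in path.strip("/").split("/"):
        if lookup("page", PAGES, segment):
            found = ("page", segment)
            break
        if lookup("category", CATEGORIES, segment):
            # A category alone does not identify a post, so move on.
            continue
        if lookup("post", POSTS, segment):
            found = ("post", segment)
            break
    return found, queries


result, queries = resolve("/mycat/mypost")
print(result, len(queries))
```

With these stand-in tables, `/mycat/mypost` costs two lookups for `mycat` and three for `mypost`, which matches the five queries counted in the quote.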
    
    Otto then goes on to explain that this isn't what WordPress actually does. Instead, when WordPress detects that you have an inefficient permalink structure, it stores extra rewrite rules in an option in the database, which it then refers to when presenting a page.
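
To see why that option can get so big, here is a rough Python sketch of the idea. It assumes a simplified rule format (a regex plus a target) that is invented for this example and is not WordPress's real rewrite rule format; `build_rules` and `match` are hypothetical names.

```python
import re


def build_rules(page_slugs, verbose):
    """Return an ordered list of (pattern, target) rules; first match wins."""
    rules = []
    if verbose:
        # One explicit rule per page, so pages are recognised before the
        # catch-all rule below. The list grows with the number of pages.
        for slug in page_slugs:
            pattern = re.compile(rf"^{re.escape(slug)}/?$")
            rules.append((pattern, ("page", slug)))
    # Generic catch-all: anything else is treated as a post slug.
    rules.append((re.compile(r"^([^/]+)/?$"), ("post", None)))
    return rules


def match(rules, path):
    """Return (kind, slug) for the first rule that matches, else None."""
    for pattern, (kind, slug) in rules:
        m = pattern.match(path)
        if m:
            return (kind, slug if slug is not None else m.group(1))
    return None


pages = [f"page-{i}" for i in range(300)]
verbose_rules = build_rules(pages, verbose=True)
compact_rules = build_rules(pages, verbose=False)
print(len(verbose_rules), len(compact_rules))
print(match(verbose_rules, "page-7"), match(verbose_rules, "hello-world"))
```

In this toy model the verbose list has one entry per page plus the catch-all, so it grows with the site, while the compact list stays at one rule. Real WordPress rules are more numerous and more complicated than this.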
    
    To finish up, let's look at a couple of quick examples.
    
    Bad:
    
    /%postname%/%post_id%/
    /%category%/%postname%/
    
    Better:
    
    /%post_id%/%postname%/
    /%year%/%category%/%postname%/
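
A quick way to check a structure against the rule quoted from Ryan Boren (verbose rules for structures beginning with %category%, %tag%, %postname% or %author%) is to look at its first tag. The Python below is a minimal sketch under that assumption; `first_tag` and `needs_verbose_rules` are names made up for this example and are not part of WordPress.

```python
import re

# Tags that, per the quote above, trigger verbose rules when they come first.
TEXT_TAGS = {"%category%", "%tag%", "%postname%", "%author%"}


def first_tag(structure):
    """Return the first %tag% in a permalink structure, or None if there is none."""
    m = re.search(r"%[a-z_]+%", structure)
    return m.group(0) if m else None


def needs_verbose_rules(structure):
    """True when the structure starts with one of the text tags."""
    return first_tag(structure) in TEXT_TAGS


for structure in [
    "/%postname%/%post_id%/",
    "/%category%/%postname%/",
    "/%post_id%/%postname%/",
    "/%year%/%category%/%postname%/",
]:
    label = "bad" if needs_verbose_rules(structure) else "better"
    print(f"{structure}: {label}")
```

Run on the four structures listed above, it labels the first two "bad" and the last two "better", the same split as the article.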
    
    In conclusion, when building a site's permalink structure, choosing carefully can help WordPress locate your articles in the most efficient way possible.
    I wonder if anyone can open their database in the browser :D


    LE: the link is here: http://dougal.gunters.org/blog/2009/02/04/efficient-wordpress-permalinks . Excerpt from the comments: "Nothing to worry about, nobody uses %postname% structure" LMAO.
    LE2: in fact, I think there are very few people using permalinks like this, otherwise this would have come out sooner. This is from the WP Codex: "Starting Permalinks with %postname% is strongly not recommended for performance reasons". Link: http://codex.wordpress.org/Using_Permalinks
     
    Last edited: Jul 14, 2009
  2. platinvm

    platinvm Senior Member

    Joined:
    Jan 31, 2009
    Messages:
    1,042
    Likes Received:
    384
    Location:
    TN
    Damn, this sucks. I have a blog with 30k posts that uses /%category%/%postname%/, but it's been like that for a couple of weeks and I have not had a problem yet.
     
  3. smaltouchoffaith

    smaltouchoffaith Newbie

    Joined:
    Jun 26, 2009
    Messages:
    32
    Likes Received:
    4
    You have got to be kidding me. And everyone rants and raves about WordPress. Hmm... now trying to think what I am going to do :)
     
  4. whynot

    whynot Registered Member

    Joined:
    Oct 3, 2007
    Messages:
    77
    Likes Received:
    130
    Hey Aliskorn, thanks for the heads up on this.

    Bad:

    /%postname%/%post_id%/
    /%category%/%postname%/

    Better:

    /%post_id%/%postname%/
    /%year%/%category%/%postname%/

    I wonder if /%year%/%postname%/ would be okay?

    Could you post the link to the article so we can follow what is being said there?

    Thanks
     
  5. aliskorn

    aliskorn Jr. VIP Jr. VIP Premium Member

    Joined:
    Apr 21, 2008
    Messages:
    484
    Likes Received:
    459
    Occupation:
    Psychologist & programmer
    Dedi server I suppose?

    Post edited, link added. Also,yes starting the permalink with %year% is okay.
     
  6. black1411

    black1411 Junior Member

    Joined:
    Dec 14, 2008
    Messages:
    158
    Likes Received:
    65
    There's a plugin you can use to solve this. Try googling "wp permalink redirection plugin" and you'll be able to redirect the old links to the new links. That's what I did with one of my blogs.

    Another good plugin is one that stops auto pinging, otherwise your WP site will be blacklisted by ping services for too many pings.

    Last tip: find a way to stop automatic post revision saving if you want to keep your blog from being overloaded. Again, you'll find it on the big G.
     
    Last edited: Jul 14, 2009
  7. lakeeffect

    lakeeffect Newbie

    Joined:
    May 10, 2009
    Messages:
    48
    Likes Received:
    12
    Occupation:
    Affilate Marketer, Rocker
    Location:
    SoCal
    Before everyone starts abandoning WordPress, I want to point out that this article/news is BULLSHIT! /%category%/%postname%/ is an ideal permalink structure. If you're worried about rewrite rules, then run WP Super Cache so you have a static page that doesn't tax the system resources. I have over 400 WordPress installs set up this way and have never had an issue, even when on the front page of Digg, getting 10K views an hour for days. Whoever authored this is an amateur who doesn't know shit about caching.
     
  8. Sweetfunny

    Sweetfunny Jr. VIP Jr. VIP Premium Member

    Joined:
    Jul 13, 2008
    Messages:
    1,747
    Likes Received:
    5,039
    Location:
    ScrapeBox v2.0
    Storing a few extra kilobytes of data is not a problem. What causes a performance hit is the type of query, how many rows and tables it scans, etc. The overhead from mapping the postname is "nothing" compared to most plugins.

    I find it laughable that people would worry about this when they run a million plugins for dumb shit like putting analytics on their blog, when you can just open the footer and paste it in.

    I've got a blog with just under 1/4 million posts using just /postname and there's zero noticeable difference compared to having 1 post. What "will" lag it is if I run a similar-posts plugin, for example, and it has to scan 1/4 million posts and pick 5 similar ones for each page loaded.
     
  9. nufaman

    nufaman Elite Member

    Joined:
    May 29, 2009
    Messages:
    1,697
    Likes Received:
    1,185
    And then you have people that read that crap and go running to post it on the forums...

    The sky is falling! The sky is falling!
     
  10. aliskorn

    aliskorn Jr. VIP Jr. VIP Premium Member

    Joined:
    Apr 21, 2008
    Messages:
    484
    Likes Received:
    459
    Occupation:
    Psychologist & programmer
    Thanks for your input. Do you store it on one box only?