Can Google recognize index.asp as the default file?

Discussion in 'White Hat SEO' started by xiangzhanger, Jul 20, 2009.

  1. xiangzhanger

    xiangzhanger Newbie

    Joined:
    Feb 26, 2009
    Messages:
    19
    Likes Received:
    3
    Google Webmaster Tools says it can't reach index.html, but as far as I know index.html has never existed; there is only index.asp. It explains that links (backlinks) to the homepage (that is, http://www.XXXX.com/index.html) can't be reached, LOL. Is this just temporary, caused by Google's older index? Or should I change something somewhere?


    P.S. Google hasn't indexed http://www.XXXX.com/index.html or http://www.XXXX.com/index.asp, meaning I can't find either of them in search. But http://www.XXXX.com is OK.


    Thanks
     
  2. nytrolic

    nytrolic Registered Member

    Joined:
    Oct 7, 2008
    Messages:
    86
    Likes Received:
    35
    I didn't write this text; I found it on the net somewhere. Anyway, maybe it's helpful to you.

    This is the most basic of SEO problems. It is the first thing that every SEO should tackle for their site, as it leads to other problems that will limit your ability to rank well. That makes me question why, if you are an "SEO freelancer" as you say, you do not already have an excellent understanding of the problem itself, why it is bad, and how to fix it... But whatever... I'll explain it again here for the 100th time at least. LOL

    Google and the other engines rank URLs. They don't rank sites... they don't even rank web pages. Every unique URL is considered by the search engines to be a different 'page' in their index. Most sites out there that are not professionally SEO'd have canonical issues because their owners are not aware of this fact. And half of the so-called SEOs out there don't even understand it.

    Every 'page' on your site should have one and ONLY one URL. This is called the canonical URL or preferred URL.

    For example,

    http://example.com/
    http://example.com/index.html
    http://www.example.com/
    http://www.example.com/index.html

    might all be URLs for your home page. Google and the other engines see these as 4 different 'pages' because each has a different URL. This leads to a couple of problems that affect your rankings for particular keyword phrases - 1) duplicate content and 2) split page rank/link juice.

    It leads to duplicate content because your site serves up the exact same content (your home page) under all 4 URLs. So one of the 4 URLs (you have no way of knowing which) gets flagged as the original version of the content and the other 3 get flagged as duplicate. For the 3 duplicate versions of the home page, all ranking factors that are based on the content of the page are devalued in the ranking algorithm.

    Since Google sees them as 4 different pages, if they each have 10 inbound links from 10 different sites then what you have is 4 URLs with 10 inbound links each.

    The way to fix this is to decide on some rules of how to determine which URL is the canonical or preferred URL. This usually means making decisions like:

    - www vs non-www
    - show trailing '/' when referencing folders w/ default documents or hide the trailing '/'
    - show default document filename when referencing folders w/ default documents or hide the default document name
    - if you support https as well then which pages should be https and which should be http (don't allow a single page to get indexed as both)

    It doesn't matter which rules you decide on for constructing canonical URLs as long as you decide on the rules and enforce them across your site w/ 301 redirects. I always choose www, show trailing '/', and hide default document name. So my preferred canonical URL in the above example would be http://www.example.com/ but that is just my preference.
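As a rough illustration of the rules chosen here (www, keep the trailing '/', hide the default document name), a minimal Python sketch follows. The names `canonicalize`, `redirect_for`, and `DEFAULT_DOCUMENTS` are assumptions made up for this example and are not part of any server or SEO tool; on a real site the rules would be enforced by the web server's redirect configuration. The sketch only strips a default document from the end of the path and does not try to guess whether an extensionless path such as /docs is a folder.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed default-document names; adjust to whatever the server actually serves.
DEFAULT_DOCUMENTS = {"index.html", "index.htm", "index.asp"}


def canonicalize(url: str) -> str:
    """Apply the three rules: force www, keep the trailing '/', hide the default document."""
    parts = urlsplit(url)

    # Rule 1: www vs non-www (this sketch prefers www).
    host = parts.netloc.lower()
    if not host.startswith("www."):
        host = "www." + host

    # Rules 2 and 3: an empty path becomes '/', and a default document name is dropped.
    path = parts.path or "/"
    head, _, tail = path.rpartition("/")
    if tail.lower() in DEFAULT_DOCUMENTS:
        path = head + "/"

    # The fragment is discarded because it is never sent to the server.
    return urlunsplit((parts.scheme, host, path, parts.query, ""))


def redirect_for(url: str):
    """Return the 301 target for a non-canonical URL, or None if it is already canonical."""
    target = canonicalize(url)
    return None if target == url else target
```

With these assumptions, `redirect_for("http://example.com/index.html")` returns `"http://www.example.com/"`, while `redirect_for("http://www.example.com/")` returns `None` because that URL is already canonical.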

    To fix the canonical issues you simply redirect all other non-canonical URLs to the canonical URL similar to the following:

    http://example.com/ --> 301 redirect --> http://www.example.com/
    http://example.com/index.html --> 301 redirect --> http://www.example.com/
    http://www.example.com/ --> canonical URL, no redirect required
    http://www.example.com/index.html --> 301 redirect --> http://www.example.com/

    Now Google will give your canonical URL credit for all inbound links to the other 3 URLs as well as giving it credit for the link text used to link to the other 3 non-canonical URLs. This means the PR will be passed from the other 3 non-canonical URLs to the canonical. The redirects also cause the other 3 non-canonical URLs to drop out of the index.

    So now Google sees your home page http://www.example.com/ as a single URL with 40 inbound links instead of 4 different URLs with 10 links each.
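The arithmetic can be sketched in a few lines of Python. The link counts are the hypothetical 10-per-URL figures from the example, and `consolidate` is an illustrative helper, not a model of how Google actually computes anything.

```python
from collections import Counter

CANONICAL = "http://www.example.com/"

# The 301 mapping listed above: each non-canonical URL points at the canonical one.
REDIRECTS = {
    "http://example.com/": CANONICAL,
    "http://example.com/index.html": CANONICAL,
    "http://www.example.com/index.html": CANONICAL,
}

# Hypothetical inbound-link counts before any redirects are in place.
inbound_links = Counter({
    "http://example.com/": 10,
    "http://example.com/index.html": 10,
    "http://www.example.com/": 10,
    "http://www.example.com/index.html": 10,
})


def consolidate(links, redirects):
    """Credit each URL's inbound links to its redirect target, if it has one."""
    merged = Counter()
    for url, count in links.items():
        merged[redirects.get(url, url)] += count
    return merged


print(consolidate(inbound_links, REDIRECTS))
# Counter({'http://www.example.com/': 40})
```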

    This eliminates duplicate content issues on your site and split page rank. Your home page will gain some PR because it's getting credit for 4 times as many inbound links, and hopefully, because of the additional links w/ relevant link text, it will rank better for the terms other sites are using in the links. It now has 4 times as many link texts to be considered for keyword rankings in the SERPs and 4 times as many potentially relevant referring pages to be considered.
     
  3. xiangzhanger

    xiangzhanger Newbie

    Joined:
    Feb 26, 2009
    Messages:
    19
    Likes Received:
    3
    Thanks. So I should 301 redirect http://www.homepage.com/index.html to http://www.homepage.com/index.asp? But index.html has never existed, so I don't think it will do any good.
     
  4. soulchief

    soulchief Junior Member

    Joined:
    Oct 17, 2007
    Messages:
    117
    Likes Received:
    55
    Location:
    Canada
    You can always use mod_rewrite so that a request for index.html is served by index.asp.

    .htaccess
    Code:
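    # mod_rewrite in .htaccess context requires FollowSymLinks to be enabled
    Options +FollowSymLinks
    RewriteEngine On
    # Internally serve index.asp for index.html (the escaped dot matches a literal ".")
    RewriteRule ^index\.html$ index.asp [L]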