
I blocked Google and it still indexed me?

Discussion in 'White Hat SEO' started by ShadeDream, Mar 20, 2010.

  1. ShadeDream

    ShadeDream Elite Member

    Joined:
    Nov 27, 2008
    Messages:
    2,209
    Likes Received:
    5,230
    Location:
    He who laughs last, laughs longest.
    I bought a domain (as an example, hello.com) and set up a WordPress blog on it. I decided to block Google and all other search engines by default until the site was complete. While working on something else, I made around 5 backlinks to hello.com while it was still blocking search engines. Today I googled the domain without the extension and, to my surprise, it's indexed as "www.hello.com/". Is it because of the backlinks? I mean, it must be, because how else would it be indexed? But how is this possible?
     
    Last edited: Mar 20, 2010
  2. ronmojohny

    ronmojohny Regular Member

    Joined:
    Dec 9, 2008
    Messages:
    283
    Likes Received:
    59
    Google sometimes ignores the robots.txt file. I think you can verify ownership in Google Webmaster Tools and then ask them to remove it from the index (from within Webmaster Tools).
     
  3. ShadeDream

    ShadeDream Elite Member

    Joined:
    Nov 27, 2008
    Messages:
    2,209
    Likes Received:
    5,230
    Location:
    He who laughs last, laughs longest.
    I actually don't want to remove it since I will need it indexed anyway. I was just wondering why this happens.
     
  4. nufaman

    nufaman Elite Member

    Joined:
    May 29, 2009
    Messages:
    1,697
    Likes Received:
    1,185
    How exactly were you blocking it?
     
  5. 195471

    195471 Regular Member

    Joined:
    Oct 11, 2008
    Messages:
    417
    Likes Received:
    260
    There's a difference between spidering (crawling) and indexing. Disallowing a bot in your robots.txt only stops it from crawling your pages; it doesn't prevent the URLs from getting indexed, for example when other sites link to them. If you really don't want a page to be indexed, the best thing to do, imho, is to add a meta tag that blocks indexing:

    Code:
    <meta name="robots" content="noindex" />
    You need to add it to every page that you don't want indexed.
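
    If you want to check both things on your own pages, here is a rough Python sketch using only the standard library. The names `RobotsMetaParser`, `has_noindex` and `crawl_allowed` are made up for this example, and the robots.txt content is an assumed "block everything" file, not the actual one from hello.com. It only shows the two mechanisms side by side: robots.txt answers "may a bot fetch this URL?", while the meta tag answers "may this page be indexed?".

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> also end up here.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def has_noindex(html):
    """Return True if the page carries a robots meta tag with 'noindex'."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


def crawl_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt lets user_agent fetch url (crawling only)."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules.can_fetch(user_agent, url)


# Assumed example inputs, not taken from the real site.
BLOCK_ALL = "User-agent: *\nDisallow: /"
PAGE = '<html><head><meta name="robots" content="noindex" /></head></html>'

if __name__ == "__main__":
    print("crawl allowed:", crawl_allowed(BLOCK_ALL, "Googlebot", "http://www.hello.com/"))
    print("noindex present:", has_noindex(PAGE))
```

    Keep in mind that a bot has to be able to fetch a page to see its meta tag, so a page that is disallowed in robots.txt and also carries noindex may never have the noindex read.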

    Edit: some reading material...
    Code:
    http://sebastians-pamphlets.com/crawling-vs-indexing/
    http://sebastians-pamphlets.com/smart-robots-txt/
     
    Last edited: Mar 21, 2010
  6. ShadeDream

    ShadeDream Elite Member

    Joined:
    Nov 27, 2008
    Messages:
    2,209
    Likes Received:
    5,230
    Location:
    He who laughs last, laughs longest.
    robots.txt

    But as 195471 said, I have just found this on Google: