
Expert Advice Needed on Robots.txt

Discussion in 'White Hat SEO' started by jhakasseo, Feb 3, 2014.

  1. jhakasseo

    jhakasseo Senior Member

    Joined:
    Mar 1, 2012
    Messages:
    827
    Likes Received:
    237
    Occupation:
    Internet Marketing
    Location:
    On d Earth
    Home Page:
    Hello BHW,

    I have a portal website hosted on Amazon with Apache.
    I want to block some URLs from Google, so I have added them to robots.txt:

    User-agent: *
    Disallow: /load_cal_ajax/
    Disallow: http://prod.example.com
    Disallow: http://example.com/search/get_cat_by_city/
    Disallow: http://example.com/search/get_cat_by_city/Goa
    Disallow: http://example.com/search/load_venue_by_cat/Bangalore
    Disallow: http://example.com/search/load_venue_by_cat/Hyderabad
    Disallow: http://example.com/search/load_venue_by_cat/Goa
    Disallow: http://example.com/search/load_venue_by_cat/Mumbai
    Disallow: http://example.com/welcome/load_cities
    Disallow: http://example.com/welcome/load_cities/Bangalore
    Disallow: http://example.com/welcome/load_cities/Goa
    Disallow: http://example.com/welcome/load_cities/Pune
    Disallow: http://example.com/pagenotfound
    Disallow: http://example.com/m/

    Is this the right way to keep URLs from being indexed? I uploaded this file a month ago, but these URLs keep getting indexed in Google.
     
  2. MattsBackpack

    MattsBackpack Newbie

    Joined:
    Nov 1, 2012
    Messages:
    18
    Likes Received:
    3
    Occupation:
    Digital Marketing Consultant
    Location:
    mattsbackpack.co.uk
    You'd be better off placing a noindex robots meta tag in the <head> of the pages you don't want indexed.

    That way they have to read the tag instructing them not to index the page because it's on the page itself. So you're not relying on them to read and comply with your robots.txt file.
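A minimal Python sketch of how one might check that a page actually carries the tag described above. It assumes the tag takes the usual form `<meta name="robots" content="noindex">`; the class and function names are hypothetical and only for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects directives from <meta name="robots"> / <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so normalise to "".
        attr = {key.lower(): (value or "") for key, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for token in attr.get("content", "").split(","):
                self.directives.add(token.strip().lower())


def has_noindex(html):
    """Return True if the HTML contains a robots meta tag with 'noindex'."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives
```

A crawler can only see this tag if it is allowed to fetch the page, so a URL that is also disallowed in robots.txt will never have its noindex tag read.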

    That being said, in theory at least, Googlebot should normally always look for and abide by the instructions in your robots file, so I'm surprised that these URLs are being indexed.
     
  3. ChrisX

    ChrisX Jr. VIP Jr. VIP Premium Member

    Joined:
    Oct 8, 2011
    Messages:
    229
    Likes Received:
    124
    Gender:
    Male
    Home Page:
    You don't need to include the full URL in robots.txt, so your first directive is actually correct. For the others, simply remove http://example.com and start with a leading /.
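The clean-up described above can be sketched in Python. This is only an illustration: `normalize_disallow` and its `site_host` parameter are hypothetical names, and the sketch assumes that a rule pointing at a different host (such as http://prod.example.com) should be dropped, because a robots.txt file only governs the host that serves it.

```python
from urllib.parse import urlsplit


def normalize_disallow(value, site_host):
    """Return the path form of a Disallow value.

    Returns None when the value is a full URL on another host, since
    that host needs its own robots.txt.
    """
    value = value.strip()
    if not value:
        return ""  # an empty Disallow means "allow everything"; keep it empty
    parts = urlsplit(value)
    if parts.scheme in ("http", "https") and parts.netloc:
        if parts.netloc.lower() != site_host.lower():
            return None
        path = parts.path or "/"
        return path + ("?" + parts.query if parts.query else "")
    # Already a path: just make sure it starts with a slash.
    return value if value.startswith("/") else "/" + value
```

For example, `normalize_disallow("http://example.com/m/", "example.com")` gives `/m/`, while the `http://prod.example.com` line gives `None` and would have to be handled by a separate robots.txt on that subdomain.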

    Additionally, if you're using Webmaster Tools, you can remove the URLs from there.
     
  4. jhakasseo

    jhakasseo Senior Member

    Joined:
    Mar 1, 2012
    Messages:
    827
    Likes Received:
    237
    Occupation:
    Internet Marketing
    Location:
    On d Earth
    Home Page:
    The interesting part is that there is an option called crawl error >> blocked urls.
    It shows 0 blocked URLs?
     
  5. davethompson

    davethompson Newbie

    Joined:
    Feb 3, 2014
    Messages:
    33
    Likes Received:
    4
    A full URL is not required in the robots.txt file. You can copy and paste the rules below into your robots.txt file, or you can manually add nofollow and noindex meta tags to the particular pages.


    User-agent: *
    Disallow: /load_cal_ajax/
    Disallow: /search/get_cat_by_city/
    Disallow: /search/get_cat_by_city/Goa
    Disallow: /search/load_venue_by_cat/Bangalore
    Disallow: /search/load_venue_by_cat/Hyderabad
    Disallow: /search/load_venue_by_cat/Goa
    Disallow: /search/load_venue_by_cat/Mumbai
    Disallow: /welcome/load_cities
    Disallow: /welcome/load_cities/Bangalore
    Disallow: /welcome/load_cities/Goa
    Disallow: /welcome/load_cities/Pune
    Disallow: /pagenotfound
    Disallow: /m/
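A rough way to see the difference between the two rule styles is to run them through Python's standard `urllib.robotparser`. This is a sketch, not a statement about how Googlebot itself parses the file: the standard-library parser does simple prefix matching on the URL path, and the `is_blocked` helper and the shortened rule sets are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# Path-only rules, as in the corrected file (shortened).
FIXED = """\
User-agent: *
Disallow: /load_cal_ajax/
Disallow: /search/get_cat_by_city/
Disallow: /welcome/load_cities
Disallow: /pagenotfound
Disallow: /m/
"""

# Full-URL rules, as in the originally uploaded file (shortened).
BROKEN = """\
User-agent: *
Disallow: http://example.com/search/get_cat_by_city/
Disallow: http://example.com/m/
"""


def is_blocked(robots_txt, url, agent="Googlebot"):
    """Return True if this parser considers the URL disallowed for the agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)
```

With the path-only rules, `is_blocked(FIXED, "http://example.com/m/")` is true; with the full-URL rules the same check is false, because a request path such as `/m/` never starts with `http://example.com/m/`. Since matching is by prefix, `/search/get_cat_by_city/` also covers `/search/get_cat_by_city/Goa`.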
     
  6. jhakasseo

    jhakasseo Senior Member

    Joined:
    Mar 1, 2012
    Messages:
    827
    Likes Received:
    237
    Occupation:
    Internet Marketing
    Location:
    On d Earth
    Home Page:
    Thanks to all of you, but I want to block them through robots.txt.
    And it is a strange issue. Has anyone ever had this kind of issue, with 0 blocked URLs in Google Webmaster Tools?
     
  7. lmxftw

    lmxftw BANNED BANNED

    Joined:
    Mar 11, 2013
    Messages:
    1,010
    Likes Received:
    1,179
    `ere u go mate

    http://www.mcanerin.com/EN/search-engine/robots-txt.asp
     
  8. jhakasseo

    jhakasseo Senior Member

    Joined:
    Mar 1, 2012
    Messages:
    827
    Likes Received:
    237
    Occupation:
    Internet Marketing
    Location:
    On d Earth
    Home Page:
  9. Conor

    Conor Jr. VIP Jr. VIP

    Joined:
    Nov 7, 2012
    Messages:
    3,361
    Likes Received:
    5,423
    Gender:
    Male
    Location:
    South Africa
    Home Page:
  10. jhakasseo

    jhakasseo Senior Member

    Joined:
    Mar 1, 2012
    Messages:
    827
    Likes Received:
    237
    Occupation:
    Internet Marketing
    Location:
    On d Earth
    Home Page:
    What about directories? I want to block directories.
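On directories: a Disallow rule is a path prefix, so a rule ending in a slash covers the directory and everything beneath it. The sketch below checks this with Python's standard `urllib.robotparser`, using the `/m/` and `/load_cal_ajax/` directories from this thread; the `blocked` helper and the sample paths are hypothetical, and real crawlers may differ in details from this parser.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
User-agent: *
Disallow: /m/
Disallow: /load_cal_ajax/
"""


def blocked(path, rules=RULES, agent="*"):
    """Return True if the path on example.com is disallowed by the rules."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return not parser.can_fetch(agent, "http://example.com" + path)
```

Here `blocked("/m/page.html")` and `blocked("/m/sub/deep/file")` are both true, while `blocked("/mobile")` and `blocked("/m")` are false, because neither path starts with `/m/`. Leaving off the trailing slash (`Disallow: /m`) would widen the rule to every path that begins with `/m`.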