
If Google Crawler follows a deep link to my website, will it first check robots.txt for disallow?

Discussion in 'White Hat SEO' started by Nsr-dj, Jan 20, 2017.

  1. Nsr-dj

    Nsr-dj Newbie

    Jan 20, 2017
    I'm aware this is not the best way to prevent a website from being indexed, but I've got a dev environment behind password protection. The site is set to noindex, and robots.txt disallows all crawlers everywhere (the robots.txt itself is not behind password protection). I know the disallow stops the crawler from ever reading the noindex, but it couldn't read it anyway because of the password protection.

    If a deep link to this dev environment exists by mistake and the crawler follows it, will it first check robots.txt to see whether the URL is disallowed before indexing? Or will it skip this step and simply index the page?
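
To make the setup concrete, here is a minimal sketch of how a rule-abiding parser would read a disallow-all robots.txt for a deep link. It uses Python's standard `urllib.robotparser`. The robots.txt contents, the `is_allowed` helper, and the `dev.example.com` URL are assumptions for illustration, and this only shows how the rules are interpreted, not what Google's crawler actually does.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt contents: block every crawler from every path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # Parse the rules from text instead of fetching them over the network.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical deep link into the dev environment.
deep_link = "https://dev.example.com/some/deep/page"
print(is_allowed("Googlebot", deep_link))  # prints False
```

A crawler that honours these rules would not fetch the deep link at all, so it would never reach either the password prompt or the noindex.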