Hiding a WordPress development website from the Google search index is something you need to do, and it can be done in a couple of ways.
The reason is that once your pages are indexed they take a while to get removed, and since you are at the development stage you will probably get surplus, duplicate and incorrect content indexed, which is something a client won’t want.
This guide looks at hiding WordPress content via WordPress itself and via a more robust method, HTTP basic authentication.
Discourage Search Engines
You can tick the ‘Discourage search engines from indexing this site’ checkbox under Dashboard > Settings > Reading. This is supposed to prevent the Googlebot crawler from indexing the site.
In the WordPress version described here, ticking the box changes the robots.txt file that WordPress serves (newer WordPress releases add a noindex robots meta tag to each page instead). Normally the file looks like…
    User-agent: *
    Disallow: /wp-admin/
    Allow: /wp-admin/admin-ajax.php
With ‘Discourage search engines from indexing this site’ checked it is…
    User-agent: *
    Disallow: /
So when checked it asks search engines not to crawl the pages, but there is no guarantee that search engines will honour the request. A bigger issue is that you may forget to reverse the setting when the site goes live.
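To see what the two robots.txt variants above actually ask of a crawler, here is a minimal sketch using Python's standard-library robots.txt parser. The helper name can_crawl and the sample paths are made up for illustration, and the parser only models a well-behaved crawler. It says nothing about crawlers that ignore the file.

```python
from urllib.robotparser import RobotFileParser

# The two robots.txt variants quoted in this guide.
DEFAULT_RULES = """\
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
"""

DISCOURAGE_RULES = """\
User-agent: *
Disallow: /
"""


def can_crawl(robots_txt: str, path: str, agent: str = "Googlebot") -> bool:
    """Return True if a polite crawler may fetch `path` under these rules.

    `can_crawl` is a hypothetical helper name, not part of WordPress.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # "/sample-page/" is an illustrative path, not taken from a real site.
    for label, rules in (("default", DEFAULT_RULES), ("discourage", DISCOURAGE_RULES)):
        print(label, can_crawl(rules, "/sample-page/"))
```

Running the same check against the live site's robots.txt is one way to catch a forgotten ‘Discourage’ setting after launch.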
Use HTTP Basic Authentication
HTTP Basic authentication is a simple username/password challenge: instead of the site, visitors see an operating-system-style dialog box and must authenticate before the site loads. The browser caches the username/password combination for a period of time.
What’s good about this is that search engines can’t reach your pages to index them, and it is very difficult to forget to disable it when you take the website live.
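For the curious, this is roughly what the browser sends once you fill in that dialog box: an Authorization header carrying the username and password joined by a colon and Base64-encoded. The sketch below is illustrative only. The function names are invented here, and the password/password pair is the example used later in this guide.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the Authorization header value a browser would send."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def decode_basic_auth(header_value: str) -> tuple[str, str]:
    """Reverse the encoding, as the web server does before checking the password."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic authentication header")
    username, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return username, password


if __name__ == "__main__":
    header = basic_auth_header("password", "password")
    print(header)
    print(decode_basic_auth(header))
```

Note that Base64 is an encoding, not encryption, so anyone who can read the traffic can decode the credentials. Serving the development site over HTTPS avoids that.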
Doing it in cPanel
cPanel has this authentication built in, in the form of Directory Protection.
This works fine, but cPanel makes you create a password with a strength over 70. For this kind of use I just want a simple password, especially when a client is proofing the site, so you can also set it up manually.
Manually Using htpasswd and htaccess
For non-cPanel hosting, the manual way produces the same result as the cPanel way; cPanel simply does it all for you.
You need to work with two files: .htaccess and .htpasswd.
In .htaccess at the top of the file all you need is this…
    AuthName "Protected Area"
    AuthUserFile "path-to-this-file/.htpasswd"
    AuthType Basic
    Require valid-user
The key thing is to set the path to the .htpasswd file correctly; this is the file that contains the user and password. You can actually call this file anything you like.
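Since the path is the part most likely to go wrong, here is a small sketch that renders the four directives and refuses a relative path. Apache resolves a non-absolute AuthUserFile against its ServerRoot, not against the folder holding .htaccess, so an absolute path is the safer choice. The function name build_htaccess_block and the path /home/example/.htpasswd are hypothetical; substitute the real location on your host.

```python
import posixpath


def build_htaccess_block(htpasswd_path: str, realm: str = "Protected Area") -> str:
    """Render the basic-auth directives for the top of .htaccess.

    `htpasswd_path` must be an absolute server path. Relative paths are
    rejected here because Apache would resolve them against ServerRoot.
    """
    if not posixpath.isabs(htpasswd_path):
        raise ValueError(f"AuthUserFile should be an absolute path, got {htpasswd_path!r}")
    lines = [
        f'AuthName "{realm}"',
        f'AuthUserFile "{htpasswd_path}"',
        "AuthType Basic",
        "Require valid-user",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical path, for illustration only.
    print(build_htaccess_block("/home/example/.htpasswd"))
```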
The .htpasswd file contains the username and the hashed password, one username:hash entry per line.
The example used here is the user/password combo of password/password, which is nice and easy for us but still stops Google indexing.
You can generate the password entry using an online generator: just plug in the username and password and copy the resulting line.
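If you would rather not type a password into a third-party website, the entry can also be produced locally. The sketch below uses Apache's {SHA} format (Base64 of the SHA-1 digest) only because it needs nothing beyond the Python standard library. It is unsalted and considered weak, so treat it as adequate for keeping crawlers out of a development site and nothing more. Where Apache's own htpasswd command-line tool is available, its bcrypt option is the better choice. The function name htpasswd_sha1_line is made up for this example.

```python
import base64
import hashlib


def htpasswd_sha1_line(username: str, password: str) -> str:
    """Return one .htpasswd entry in Apache's {SHA} format.

    Assumption: the server accepts {SHA} entries. This format is unsalted
    SHA-1, so do not reuse the password anywhere that matters.
    """
    if ":" in username:
        raise ValueError("username must not contain ':'")
    digest = hashlib.sha1(password.encode("utf-8")).digest()
    encoded = base64.b64encode(digest).decode("ascii")
    return f"{username}:{{SHA}}{encoded}"


if __name__ == "__main__":
    # The password/password combination used as the example in this guide.
    print(htpasswd_sha1_line("password", "password"))
```

Paste the printed line into the .htpasswd file as its only content.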
That’s it. It’s a clean solution for keeping pages out of the index whilst the site is in development, and it also protects a client’s data before the site goes live on the web.