If you want your web site password protected, look into using .htaccess together with an .htpasswd file (servers running Apache can use this). However, that may not be what you want; perhaps you just want it hidden from Google and the other search engines. In that case you will want to use a robots.txt file. I won’t go into that in detail since others have covered it very well, but basically you can tell web crawlers/spiders which folders to ignore, or to ignore your whole site.

The following tells all spiders not to crawl any part of your site.

User-agent: *
Disallow: /

If you only want to prevent the Internet Archive (Wayback Machine) from archiving your site, use this for your robots.txt:

User-agent: ia_archiver
Disallow: /

You can also use the following META tag.

<meta name="robots" content="noarchive, noindex, nofollow, noimageindex, noimageclick" />

I like to use that in combination with the robots.txt file. There is also a META tag just for Googlebot: use name="googlebot" in place of name="robots".

Another thing I usually like to do is disable the floating toolbar that IE gives you for really big images. It's not going to keep your site private; the toolbar is just an annoyance to me.

<meta http-equiv="imagetoolbar" content="no" />

Another useful thing to do when you register your domain name is to pay the few extra dollars for Private Registration. While this doesn’t keep your web site private, you are less likely to get email and snail mail spam. Even though people are not supposed to use the WHOIS contact data for spam purposes, some people do anyway.


