You have several techniques to solve this, each with its advantages and disadvantages. Based on your description, just use robots.txt, which is a file that tells search engines that you do not want that content to be indexed by them.
The presence of this file does not guarantee anything, but the best-known search engines respect it. If you need a guarantee, you will have to use a protection mechanism that requires at least basic authentication before anyone can access the content.
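As a sketch, a minimal robots.txt placed at the root of the site could look like the following; the /private/ path is just a hypothetical example of the directory you want to keep out of search results:

# applies to all crawlers
User-agent: *
# do not crawl anything under /private/ (example path)
Disallow: /private/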
It is possible to use the meta technique shown in the other answer, and it does work, but it has drawbacks: the tag has to be placed in the <head> of every page, and if one day you need to change it across the site, you have to change every file. With robots.txt you can say which pages will be affected in a centralized way, which is much better.
In some cases the meta tag may be the better or the only way to do it, but that is rarely the case; it should be a secondary solution. Still, if you do use it, it is better to do it like this:
<meta name="robots" content="noindex, nofollow">
You can also control this at the HTTP server level, although the cases where that is more interesting are rare. You can send this header in the response:
X-Robots-Tag: noindex
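If you want to set that header centrally from the server configuration, a minimal sketch (assuming Apache with mod_headers enabled, or Nginx) would be:

# Apache (.htaccess or vhost config), for whatever <Location>/<Directory> you choose
Header set X-Robots-Tag "noindex, nofollow"

# Nginx equivalent, inside the relevant location block
add_header X-Robots-Tag "noindex, nofollow";

The advantage of this approach is that it also works for non-HTML resources such as PDFs or images, where you cannot place a meta tag.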