When it comes to web systems, what would be the safest and most widely used way to make a site invisible to search engines? I know there are the "robots" meta tags, but I believe there must be more reliable ways.
Before indexing a site, search engines generally look for a file called robots.txt and follow the guidelines contained in this file.
Example: before crawling www.mysystem.com, the crawler will first request www.mysystem.com/robots.txt
You can state that the whole site should not be indexed by placing the following content in robots.txt:
User-agent: *
Disallow: /
You can also allow the site to be indexed with exceptions:
User-agent: *
Disallow: /protegido/
Disallow: /secreto/
Be careful with this last case: while you are asking crawlers not to index these folders, you are also advertising their paths to anyone who reads your robots.txt.
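If the real concern is keeping those pages out of search results without advertising their paths, a per-page noindex directive is often used instead of (or alongside) robots.txt, either as a <meta name="robots" content="noindex"> tag in the HTML or as an X-Robots-Tag HTTP header. The sketch below is not part of the original answer; it is a minimal illustration using only Python's standard library, and the port number is arbitrary.

# Minimal sketch: attach an X-Robots-Tag header to every response so that
# compliant crawlers drop the pages from their index, without the paths
# ever being listed in robots.txt. Python 3 standard library only.
from http.server import HTTPServer, SimpleHTTPRequestHandler

class NoIndexHandler(SimpleHTTPRequestHandler):
    def end_headers(self):
        # Equivalent to <meta name="robots" content="noindex, nofollow">,
        # but it also covers non-HTML files such as PDFs and images.
        self.send_header("X-Robots-Tag", "noindex, nofollow")
        super().end_headers()

if __name__ == "__main__":
    HTTPServer(("", 8000), NoIndexHandler).serve_forever()  # port is arbitrary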
Keep in mind that malicious robots will not respect any of these directives: they will typically send a User-Agent string that mimics a regular browser, so only measures such as CAPTCHAs can keep them out, and those are far more intrusive because real users are affected as well.
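To make that limitation concrete, a User-Agent filter looks roughly like the sketch below; the bot names and the helper function are purely illustrative, and a malicious crawler that spoofs a browser string will simply never match.

# Naive User-Agent filter: it only catches bots that identify themselves
# honestly, which is exactly the limitation described above.
KNOWN_BOTS = ("googlebot", "bingbot", "duckduckbot")  # illustrative names only

def is_declared_crawler(user_agent: str) -> bool:
    """Return True if the User-Agent string admits to being a crawler."""
    ua = user_agent.lower()
    return any(bot in ua for bot in KNOWN_BOTS)

print(is_declared_crawler("Googlebot/2.1 (+http://www.google.com/bot.html)"))  # True
print(is_declared_crawler("Mozilla/5.0 (Windows NT 10.0; Win64; x64)"))        # False: spoofed browser UA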