The Australian National University
Web Search@ANU

Controlling who can index your site

Search engines use programs called robots (or crawlers) that read the contents of your web site and send indexing information back to the search engine.

You can set up a file in your web site called robots.txt (the name must be all lowercase) to control how robots and crawlers access your site.  If your site is already restricted to ANU Only (or similar), external search engines will not have access to your pages.

If you want only the ANU Search Engine (Funnelback) to index your pages, and no other search engines:

  1. Create a file called robots.txt (the name must be all lowercase).
  2. Insert the following text:
      User-agent: FunnelBack
      Disallow:

      User-agent: *
      Disallow: /
  3. Save the file at the top (root) level of your web site.

The text you have inserted into the robots.txt file means that the ANU Search Engine (User-agent: FunnelBack) is allowed to crawl your whole site (because there is no value after Disallow:).  All other search engines (User-agent: *) are not allowed to crawl anything on your site from the root folder downwards (Disallow: /).  Well-behaved crawlers follow these rules voluntarily; the file does not itself block access to your pages.
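One way to check rules like these before publishing them is with Python's standard urllib.robotparser module. The sketch below is illustrative only: the address www.example.org and the crawler name Googlebot are assumptions, standing in for your own site and for any other search engine.

```python
from urllib.robotparser import RobotFileParser

# The rules from step 2, with a blank line between the two groups.
RULES = """\
User-agent: FunnelBack
Disallow:

User-agent: *
Disallow: /
"""


def is_allowed(user_agent: str, url: str, rules: str = RULES) -> bool:
    """Return True if the given rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical page address, used only for illustration.
    page = "https://www.example.org/about/index.html"
    print("FunnelBack:", is_allowed("FunnelBack", page))
    print("Googlebot:", is_allowed("Googlebot", page))
```

With these rules the check should report that FunnelBack may fetch the page and that Googlebot may not. The parser compares crawler names without regard to case, so "funnelback" is treated the same as "FunnelBack".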

There are more options for disallowing search engines, such as naming particular search engines or disallowing only certain folders.  Many tutorials on this topic are available on the web.  Just search for robots.txt in any search engine to find them.
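As a sketch of the folder option, the example below builds rules that ask every crawler to stay out of two folders while leaving the rest of the site open, then checks them with Python's standard urllib.robotparser module. The folder names /private/ and /drafts/, the address www.example.org and the crawler name ExampleBot are all hypothetical.

```python
from urllib.robotparser import RobotFileParser


def build_rules(folders):
    """Build robots.txt text asking every crawler to skip the given folders."""
    lines = ["User-agent: *"]
    lines += [f"Disallow: {folder}" for folder in folders]
    return "\n".join(lines) + "\n"


def can_crawl(rules, user_agent, url):
    """Return True if the rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical folders; replace with the ones on your own site.
    rules = build_rules(["/private/", "/drafts/"])
    print(rules)
    for path in ("/index.html", "/private/notes.html"):
        url = "https://www.example.org" + path
        print(path, can_crawl(rules, "ExampleBot", url))
```

Each Disallow line matches any address whose path begins with that text, so /private/ covers everything inside that folder and leaves pages elsewhere on the site available.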