Ok, so we’ve seen how to password-protect directories to keep the web crawlers out, but I don’t want to go through all that. I want to keep the page open, but I don’t want it spidered and indexed by the bots.
There are ways to do this too; in fact, there are several. The most commonly accepted and respected way of telling a bot not to crawl certain areas of a website is with what’s called a robots.txt file. Usually this is put in the same folder as your main site index and looks like this:
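The original example didn’t survive here, but a standard robots.txt that tells every compliant crawler to stay out of the entire site looks like this:

```text
# Applies to all user agents; disallow everything from the root down
User-agent: *
Disallow: /
```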
The above will keep all robots out of your site. This might be too heavy-handed, though. Let’s say the msnbot has been a bit too voracious with your downloads area:
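A rule like the following (assuming your downloads live in a /downloads/ directory) targets just that one bot and leaves everyone else alone:

```text
# Only msnbot is affected; other crawlers see no restrictions
User-agent: msnbot
Disallow: /downloads/
```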
That should be enough to keep it out of that folder. Here’s another example.
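Since the second example didn’t survive either, here’s a plausible reconstruction showing that a single User-agent record can carry multiple Disallow lines (the directory names are made up for illustration):

```text
# Block all crawlers from several areas at once
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /private/
```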
You can get more complicated than this if you need to.
Google’s own robots.txt file (https://www.google.com/robots.txt) is a good real-world example.
To exclude a specific file from being indexed, you might try the following meta tag in your document.
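The tag itself goes in the document’s <head> section. To tell all robots not to index the page or follow its links:

```html
<head>
  <!-- Keep this page out of the index and don't follow its links -->
  <meta name="robots" content="noindex, nofollow">
</head>
```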
You can also use the index and follow values to fine-tune what you want to restrict or allow.
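For example, the four values can be mixed to get different behaviors:

```html
<!-- Index this page, but don't follow any of its links -->
<meta name="robots" content="index, nofollow">

<!-- Don't index this page, but do follow its links to other pages -->
<meta name="robots" content="noindex, follow">
```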
I’m not certain how widely the meta tag is respected; robots.txt is more likely to be followed.
To just exclude the googlebot, you might try this…
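Using "googlebot" as the meta tag’s name instead of "robots" restricts the rule to Google’s crawler:

```html
<!-- Only googlebot honors this; other bots ignore it -->
<meta name="googlebot" content="noindex">
```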
According to Google’s page on removing pages from the index, Google will respect that tag, and that would allow other bots through.