Robots meta directives, sometimes called “meta tags”, are code snippets that tell web crawlers how to crawl and index a webpage’s content. Whereas robots.txt directives give bots suggestions on how to crawl a website’s pages, robots meta directives give firmer instructions on how to crawl and index the content of a specific page.
Types of robots meta tags
There are two types of robots meta directives: those that are part of the HTML page, such as the meta robots tag, and those that are sent as HTTP headers by the web server, such as x-robots-tag. Both meta robots and x-robots-tag can use the same parameters, such as “noindex” and “nofollow”, to provide crawling and indexing instructions. The difference lies in how those parameters are communicated to crawlers.
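As a minimal sketch of that difference, the Python snippet below renders the same parameters in both forms: once as an HTML meta robots tag and once as an HTTP header pair. The function names `meta_robots_tag` and `x_robots_tag_header` are hypothetical helpers invented for this illustration, not part of any library.

```python
from html import escape


def meta_robots_tag(directives, bot="robots"):
    """Render the directives as an HTML tag for the page's <head>."""
    content = ", ".join(directives)
    return f'<meta name="{escape(bot)}" content="{escape(content)}">'


def x_robots_tag_header(directives):
    """Render the same directives as an HTTP response header (name, value)."""
    return ("X-Robots-Tag", ", ".join(directives))


directives = ["noindex", "nofollow"]
print(meta_robots_tag(directives))
print(x_robots_tag_header(directives))
```

The header form is handy for non-HTML resources such as PDFs or images, which have no place to put a meta tag.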
Meta directives provide instructions to web crawlers on how to crawl and index the content found on a specific webpage. If these directives are detected by bots, their parameters can strongly influence crawler indexation behavior. However, as with robots.txt files, web crawlers are not required to follow your meta directives, so some malicious robots may choose to ignore them.
Here are some parameters that search engine crawlers can understand and follow when they’re used in robots meta directives. The parameters are not case-sensitive, but note that some search engines follow only a subset of them or interpret some directives differently.
Controlling parameters for indexation:
- Noindex: Informs search engines that a page should not be indexed.
- Index: Tells a search engine to index a page. This meta tag is not required because it is the default setting.
- Follow: Crawlers should follow all links on a page, even if the page is not indexed, in order to pass credit to the pages it links to.
- Nofollow: This command instructs a crawler not to follow any links on a page or transfer any link equity.
- Noimageindex: Informs a crawler that no images on a page should be indexed.
- None: The same as employing the noindex and nofollow tags simultaneously.
- Noarchive: Informs search engines that a cached link to this page should not be displayed on a search engine results page (SERP).
- Nocache: Nocache is the same as noarchive, but it is only supported by Internet Explorer and Firefox.
- Nosnippet: Tells a search engine not to display a snippet of this page on a SERP (i.e. meta description).
- Noodp/noydir [OBSOLETE]: Stops search engines from using the DMOZ description of a page as the SERP snippet for this page. However, DMOZ was decommissioned in early 2017, rendering this tag obsolete.
- Unavailable_after: Informs search engines that this page should no longer be indexed after a specified date.
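To illustrate how a crawler might read these parameters, here is a simplified sketch using Python’s standard `html.parser` module. It lowercases each parameter (since they are not case-sensitive) and expands “none” into noindex and nofollow. The class `RobotsMetaParser` and the function `robots_directives` are hypothetical names for this example; a real crawler does considerably more, such as handling bot-specific tags and the date value of unavailable_after.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the parameters of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values may be None when an attribute has no value.
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for token in attr.get("content", "").split(","):
            token = token.strip().lower()  # parameters are case-insensitive
            if token == "none":
                # "none" is shorthand for noindex plus nofollow.
                self.directives.update({"noindex", "nofollow"})
            elif token:
                self.directives.add(token)


def robots_directives(html):
    """Return the set of robots meta parameters found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


print(robots_directives('<meta name="ROBOTS" content="NoIndex, Follow">'))
```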
Meta directives (robots or otherwise) are discovered only when a URL is crawled. If a robots.txt file disallows crawling of the URL, the crawler never sees the directives on the page (whether in the HTML or the HTTP header), so they are effectively ignored.
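The interaction with robots.txt can be sketched with Python’s standard `urllib.robotparser` module. The robots.txt rules and the example.com URLs below are made up for illustration, and `can_see_meta_directives` is a hypothetical helper: it simply reports whether a compliant crawler would be allowed to fetch the page, which is a precondition for reading any directive on it.

```python
from urllib.robotparser import RobotFileParser


def can_see_meta_directives(robots_txt, url, user_agent="*"):
    """Return True if robots.txt lets the crawler fetch the URL.

    A page that cannot be fetched is never parsed, so its meta robots
    tag or X-Robots-Tag header cannot take effect.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt that blocks everything under /private/.
robots_txt = """\
User-agent: *
Disallow: /private/
"""

for url in (
    "https://example.com/private/page.html",
    "https://example.com/public/page.html",
):
    print(url, can_see_meta_directives(robots_txt, url))
```

In this sketch, a “noindex” placed on the page under /private/ would go unread, because the crawler is never allowed to request that page.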
In most cases, it’s better to use a meta robots tag with the “noindex, follow” parameters instead of using a robots.txt file to restrict crawling or indexation.
It’s important to keep in mind that malicious crawlers may ignore meta directives, so this protocol is not a reliable security mechanism. If you have confidential information that you don’t want to be publicly searchable, consider using a more secure approach like password protection to restrict access to those pages.
Using both meta robots and the x-robots-tag on the same page is redundant.