Saturday, September 17, 2011

Robots.txt or Meta NoIndex Tag?

For SEO professionals and webmasters who manage their own online marketing campaigns, knowing which pages to emphasize to the search engine spiders is a key element in setting up a website to succeed in the search engine results pages (SERPs). The robots.txt file and the meta noindex tag are two ways to do this, by indicating which pages should be kept out of a search engine's index. Ultimately, the faster the search engine spiders can crawl and navigate a website, the better, and a great way to help them is to steer them toward the most important pages of the site by indicating exactly where they don't need to be. The question becomes: which is the better way of doing this, blocking pages by adding them to the robots.txt file or applying the meta noindex tag in the HTML of the web pages?

Individual or Multiple Pages?

The first question one should ask is, "Am I trying to block an individual web page from being accessed by the spiders, or do I want to block an entire sub-directory of web pages?" Typically, if you are trying to block a sub-directory, it is easiest to list it in the robots.txt file, as that is the faster and more efficient way of blocking multiple pages at once. On the other hand, if there is one particular page on your site that you do not want the spiders crawling, it may be easiest to apply the meta noindex tag in the HTML of that page. While that seems easy enough, a more important question must be asked before making a decision.
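To make the sub-directory case concrete, here is a minimal sketch that uses Python's standard `urllib.robotparser` module to check how a well-behaved spider would read a robots.txt rule. The `/private/` directory, the example paths, and the `is_crawlable` helper are hypothetical names chosen for illustration, not anything from a real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: one rule blocks a whole sub-directory
# for every spider ("User-agent: *").
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# Parse the rules from the string instead of fetching them from a live site.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(path, user_agent="*"):
    """Return True if the parsed rules allow `user_agent` to crawl `path`."""
    return parser.can_fetch(user_agent, path)


# Every page under /private/ is covered by the single Disallow line.
print(is_crawlable("/private/report.html"))
# Pages outside that sub-directory are unaffected.
print(is_crawlable("/products/widget.html"))
```

A single `Disallow` line covers every page under the directory, which is why robots.txt tends to be the more convenient choice when many pages need to be blocked at once.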

Are the Web Pages Indexed Already?

Within the SEO community, there has been quite a bit of speculation as to whether the robots.txt file or the meta noindex tag is better at preventing a web page from being indexed by a search engine. But the better question is whether the page is already indexed, because the answer will dictate which choice you make. If the page is in fact indexed, the meta noindex tag is the better option, as the robots.txt file does not do a great job of kicking a web page out of a search engine's index. The robots.txt file tells spiders not to crawl a page; it does not by itself ask them to drop a URL they already know about. In a best-case scenario, removal will take quite a long time, and the only thing you can do is kick back and hope that it happens. Keep in mind that the spiders must still be able to crawl a page in order to see its noindex tag, so a page carrying the tag should not also be blocked in robots.txt.

Meta NoIndex Tag & On-Page Links

For webmasters, or SEO professionals managing a client's campaign, who decide to apply the meta noindex tag to a web page, an additional thing to consider is what to do with the links that appear on that page. Do you want the search engine spiders following those links? Or would it be better if they did not, for example because the links point to external sites or to internal pages that are also being blocked from the search engines? Here are the two primary tag options, both implemented within the HTML of the page in question:

meta noindex, follow: this tag tells the search engine spiders not to index the page, but to follow any links that are embedded within that page.

meta noindex, nofollow: this tag tells the search engine spiders not to index the page, and not to follow any links that are embedded within that page.
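For reference, both options are usually written as a `<meta name="robots" ...>` tag in the page's `<head>`. The sketch below embeds each variant in a tiny, made-up HTML page and uses Python's standard `html.parser` module to read the directives back, roughly the way a spider might. The `RobotsMetaParser` class and the `robots_directives` helper are hypothetical names for illustration only.

```python
from html.parser import HTMLParser

# The two tag variants described above, each inside a minimal example page.
NOINDEX_FOLLOW = (
    '<html><head><meta name="robots" content="noindex, follow"></head></html>'
)
NOINDEX_NOFOLLOW = (
    '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
)


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        # The content attribute is a comma-separated list of directives.
        content = attr.get("content") or ""
        for directive in content.split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html):
    """Return the set of robots directives declared in `html`."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


print(sorted(robots_directives(NOINDEX_FOLLOW)))
print(sorted(robots_directives(NOINDEX_NOFOLLOW)))
```

A quick check like this can confirm that a page template actually emits the tag you intended before you wait on the search engines to react.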


As I pointed out above, whether the page or pages in question are already indexed really determines which direction to take. SEO professionals and advanced webmasters have generally found that both methods succeed in keeping pages that are not yet indexed out of the search engines, while the meta noindex tag may be more successful in kicking pages out of a search engine's index. And when using the meta noindex tag, don't forget to tell the search engine spiders what to do with those on-page links!
