Webmaster Papers








How to Prevent Duplicate Content with Effective Use of the Robots.txt and Robots Meta Tag


Duplicate content is one of the problems that we regularly come across as part of the search engine optimization services we offer. If the search engines determine that your site contains duplicate or near-duplicate content, this may result in penalties and even exclusion from their indexes. Fortunately, it's a problem that is easily rectified.

Your primary weapon of choice against duplicate content can be found within the "Robots Exclusion Protocol", which has now been adopted by all the major search engines.

There are two ways to control how the search engine spiders index your site.

1. The Robots Exclusion File, or "robots.txt", and

2. The Robots <meta> Tag

The Robots Exclusion File (Robots.txt)
This is a simple text file that can be created in a plain-text editor such as Notepad. Once created, you must upload the file to the root directory of your website, e.g. www.yourwebsite.com/robots.txt. Before a search engine spider crawls your website, it looks for this file, which tells it which parts of your site it may and may not index.
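Because the file always lives in the root directory, its address can be derived from any page on the site. The following is a minimal sketch in Python; the function name robots_txt_url and the sample page address are assumptions made for illustration only.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url):
    """Return the address where a spider would look for robots.txt.

    Only the scheme and host of the page are kept; the path, query
    string, and fragment are replaced by /robots.txt at the site root.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical page deep inside the example site used in this article.
location = robots_txt_url("http://www.yourwebsite.com/faq/index.html?id=3")
print(location)
```

A robots.txt file placed in a subdirectory, such as /faq/robots.txt, would not be found by this lookup, which is why the file must be uploaded to the root.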

The robots.txt file is best suited to static HTML sites, or to excluding certain files on dynamic sites. If the majority of your site is dynamically created, consider using the Robots Tag instead.

Creating your robots.txt file

Example 1 Scenario
If you wanted to make the robots.txt file applicable to all search engine spiders and make the entire site available for indexing, the robots.txt file would look like this:

User-agent: *
Disallow:

Explanation
The use of the asterisk with the "User-agent" means this robots.txt file applies to all search engine spiders. By leaving the "Disallow" blank all parts of the site are suitable for indexing.
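One way to confirm this behaviour before uploading the file is Python's standard urllib.robotparser module. The sketch below parses the Example 1 rules from a string rather than from a live site; the spider name "ExampleBot" and the sample paths are placeholders, not real values.

```python
from urllib.robotparser import RobotFileParser

# The Example 1 file, held in a string for testing.
EXAMPLE_1 = """\
User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(EXAMPLE_1.splitlines())

# Hypothetical paths; with a blank Disallow none of them should be blocked.
sample_paths = ["/", "/faq/index.html", "/cgi-bin/form.pl"]
everything_allowed = all(
    parser.can_fetch("ExampleBot", path) for path in sample_paths
)
print(everything_allowed)
```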

Example 2 Scenario
If you wanted to make the robots.txt file applicable to all search engine spiders and stop the spiders from indexing the faq, cgi-bin and images directories, as well as a specific page called faqs.html contained within the root directory, the robots.txt file would look like this:

User-agent: *
Disallow: /faq/
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /faqs.html

Explanation
The use of the asterisk with the "User-agent" means this robots.txt file applies to all search engine spiders. Preventing access to the directories is achieved by naming them, and the specific page is referenced directly. The named files and directories will no longer be indexed by any search engine spider that follows the protocol.
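The rules above can be tried against a handful of paths with the same standard-library parser. This is only a sketch: the helper blocked_paths, the spider name "ExampleBot", and the sample paths are assumptions for illustration. Note that each Disallow value is matched as a prefix of the path, so /faq/ covers everything inside that directory but not a root-level page such as /faq.html.

```python
from urllib.robotparser import RobotFileParser

# The Example 2 file, held in a string for testing.
EXAMPLE_2 = """\
User-agent: *
Disallow: /faq/
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /faqs.html
"""


def blocked_paths(rules, agent, paths):
    """Return the paths that the given rules forbid for the given spider."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return [path for path in paths if not parser.can_fetch(agent, path)]


# Hypothetical paths on the example site.
paths = [
    "/index.html",
    "/faq/shipping.html",
    "/cgi-bin/form.pl",
    "/images/logo.gif",
    "/faqs.html",
    "/faq.html",
]
blocked = blocked_paths(EXAMPLE_2, "ExampleBot", paths)
print(blocked)
```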

Example 3 Scenario
If you wanted to make the robots.txt file applicable only to the Google spider, googlebot, and stop it from indexing the faq, cgi-bin and images directories, as well as a specific HTML page called faqs.html contained within the root directory, the robots.txt file would look like this:

User-agent: googlebot
Disallow: /faq/
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /faqs.html

Explanation

By naming the particular search spider in the "User-agent" you prevent it from indexing the content you specify. Preventing access to the directories is achieved by simply naming them, and the specific page is referenced directly. The named files & directories will not be indexed by Google.
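It is worth noting that a file like this restricts only the named spider. The sketch below, again using Python's standard urllib.robotparser, compares googlebot with a second, made-up spider called "ExampleBot"; that name and the tested path are placeholders.

```python
from urllib.robotparser import RobotFileParser

# The Example 3 file, held in a string for testing.
EXAMPLE_3 = """\
User-agent: googlebot
Disallow: /faq/
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /faqs.html
"""

parser = RobotFileParser()
parser.parse(EXAMPLE_3.splitlines())

# The rules name googlebot, so only googlebot is kept out.
googlebot_allowed = parser.can_fetch("Googlebot", "/faqs.html")
# No rule group matches this hypothetical spider, so it is unrestricted.
other_allowed = parser.can_fetch("ExampleBot", "/faqs.html")
print(googlebot_allowed, other_allowed)
```

If other spiders should be excluded as well, they need their own "User-agent" group or a catch-all group using the asterisk, as in Example 2.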

That's all there is to it!

As mentioned earlier, the robots.txt file can be difficult to implement on dynamic sites; in that case it's probably necessary to use a combination of the robots.txt file and the robots tag.

The Robots Tag
This alternative way of telling the search engines what to do with site content appears in the <head> section of a web page. A simple example would be as follows:

<meta name="robots" content="noindex,nofollow">

In this example we are telling all search engines not to index the page or to follow any of the links contained within the page.

In a second example, I don't want Google to cache the page, because the site contains time-sensitive information. This can be achieved simply by adding the "noarchive" directive to the tag's content attribute.
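To check which directives a page actually carries, the robots tags can be read back out of its HTML. The following is a rough sketch using Python's standard html.parser module. The class and function names are invented for illustration, and the two sample pages are assumptions: the first uses the standard "noindex,nofollow" form for all spiders, and the second addresses the "noarchive" directive to googlebot, which is one plausible way of writing the tag described above.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in robots-style <meta> tags."""

    def __init__(self, names=("robots", "googlebot")):
        super().__init__()
        self.names = names          # meta names treated as robots tags
        self.directives = {}        # e.g. {"robots": ["noindex", "nofollow"]}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        if name in self.names:
            content = attributes.get("content") or ""
            self.directives[name] = [
                item.strip().lower() for item in content.split(",") if item.strip()
            ]


def robots_directives(html):
    """Return a dict mapping each robots meta name to its directives."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical pages: one hidden from all spiders, one not to be cached by Google.
page_one = '<html><head><meta name="robots" content="noindex,nofollow"></head></html>'
page_two = '<html><head><meta name="googlebot" content="noarchive"></head></html>'

print(robots_directives(page_one))
print(robots_directives(page_two))
```

On a dynamic site the same check can be run against each generated page, which helps confirm that duplicate versions of a page carry the tag and the original does not.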

What could be simpler!

Although there are other ways of preventing duplicate content from appearing in the search engines, this is the simplest to implement, and all websites should use a robots.txt file, a robots tag, or a combination of the two.

Should you require further information about our search engine marketing or optimization services, please visit us at http://www.e-prominence.co.uk, the search marketing company.
