Webmaster Papers








Search Bots, Crawlers, and Spiders


If you are a webmaster and you review your logs, you will often see a bunch of really strange hits. They aren't human visitors, and you can't tell their operating system or their browser! Who are these pesky little creatures that rummage around the internet all the time?

Not quite sure what I am talking about? Here are a few examples of various bots searching my website:

207.68.146.40 (msnbot.msn.com)
msnbot/1.0 (+http://search.msn.com/msnbot.htm)
This is the MSN Search bot.

207.68.146.40 (lj2070.inktomisearch.com)
Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)
This is Yahoo's search bot.

66.249.65.147 (crawl-66-249-65-147.googlebot.com)
Mediapartners-Google/2.1
This is Google's bot that scans your webpages for AdSense.
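
If you want to pick hits like these out of your own logs, here is a minimal Python sketch. It assumes you already have the user-agent string for each hit. The `KNOWN_BOTS` table and the `identify_bot` function are names I made up for illustration; the markers come from the three examples above, and a real list would be much longer.

```python
# Hypothetical lookup table: the markers come from the three example
# user agents above; the labels are only illustrative.
KNOWN_BOTS = {
    "msnbot": "MSN Search",
    "Yahoo! Slurp": "Yahoo! Search",
    "Mediapartners-Google": "Google AdSense",
}


def identify_bot(user_agent):
    """Return a label for a known bot, or None if no marker matches."""
    ua = user_agent.lower()
    for marker, label in KNOWN_BOTS.items():
        # Case-insensitive substring match against the user-agent string.
        if marker.lower() in ua:
            return label
    return None


print(identify_bot("msnbot/1.0 (+http://search.msn.com/msnbot.htm)"))
print(identify_bot("Mediapartners-Google/2.1"))
print(identify_bot("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"))
```

Keep in mind that a user-agent string is just text sent by the visitor, so a bad bot can claim any name it likes. Treat a match as a hint, not as proof.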

What is a Bot, Crawler, or Spider?
These terms all mean the same thing: an automated program that goes from website to website, caching and processing the pages for search engines. As you know, "WWW" means World Wide Web, so "Spider" seemed like an appropriate term. "Crawler" simply describes what the program does: it crawls from site to site and page to page endlessly. "Bot" is short for "robot" and, again, is just an automated program that indexes websites.

What is the purpose of a Spider?
A spider looks at all the pages of your website and uses that information to rank you in search engines (how high you will be listed in a search result). It also caches a copy of each page on the search engine's server for quick reference, and in case your site ever goes down. Spiders jump from link to link on the Internet and run endlessly. Even if you never submit your website to a search engine, odds are your site will still be spidered.
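
To make the "jump from link to link" idea concrete, here is a rough Python sketch of a spider walking a tiny website. Everything in it is an assumption for illustration: the three pages in `PAGES` are invented and live in memory instead of being fetched over the network, and a real spider would also resolve relative links, wait between requests, and obey "robots.txt".

```python
from collections import deque
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical three-page site, kept in memory so the sketch runs offline.
PAGES = {
    "/index.html": '<a href="/about.html">About</a> <a href="/news.html">News</a>',
    "/about.html": '<a href="/index.html">Home</a>',
    "/news.html": '<a href="/about.html">About</a> <a href="/missing.html">Old</a>',
}


def crawl(start, pages):
    """Visit pages breadth-first by following links; return the visit order."""
    seen = {start}
    queue = deque([start])
    visited = []
    while queue:
        url = queue.popleft()
        html = pages.get(url)
        if html is None:
            continue  # dead link: nothing to read, so skip it
        visited.append(url)
        collector = LinkCollector()
        collector.feed(html)
        for link in collector.links:
            if link not in seen:  # never queue the same page twice
                seen.add(link)
                queue.append(link)
    return visited


print(crawl("/index.html", PAGES))
# prints ['/index.html', '/about.html', '/news.html']
```

The `seen` set is what keeps the spider from going around in circles when pages link back to each other.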

Can I stop bots and spiders from searching my website?
Yes and no. Legitimate spiders are run by reputable organizations that follow certain rules. For instance, most companies have a policy that their robot will look for a file called "robots.txt" in the root of your website. This text file tells the bots what they are and are not allowed to view. Unfortunately, there are also bad bots out there that search the internet harvesting e-mail addresses for spam and doing other bad things. These bots often don't comply with the "robots.txt" standard.
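
Here is a small Python sketch of how a well-behaved bot might read those rules, using the standard library's `urllib.robotparser` module. The "robots.txt" contents, the directory names, and the bot name "SomeOtherBot" are all made up for illustration; your own file will look different.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep msnbot out of /private/ and keep every
# other bot out of /cgi-bin/.
ROBOTS_TXT = """\
User-agent: msnbot
Disallow: /private/

User-agent: *
Disallow: /cgi-bin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A polite bot asks before it fetches each page.
print(parser.can_fetch("msnbot", "/private/report.html"))    # False
print(parser.can_fetch("msnbot", "/index.html"))             # True
print(parser.can_fetch("SomeOtherBot", "/cgi-bin/form.pl"))  # False
```

Note that the file is only a request. Nothing in it can force a bot to stay out, which is exactly why the bad bots get away with ignoring it.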

How many bots are there?
It's impossible to guess how many bots are out there searching websites. On any given day, roughly 10 different ones check my website. Some of them only look at one or two pages; others go over my entire website. Not all of them give you a good description of what they do or who owns them. If you cut and paste their name and IP address into Google, you can quite often find more information about what they do.
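
If you would like a quick tally before you start searching, here is a rough Python sketch that groups hits by the domain at the end of each hostname. It assumes your log shows each visitor as "IP (hostname)", the way the examples in this article do. The names `ENTRY`, `owner_domain`, and `count_by_domain` are my own, and taking the last two parts of the hostname is only a naive guess at who owns the bot.

```python
import re
from collections import Counter

# Matches lines shaped like "66.249.65.147 (crawl-66-249-65-147.googlebot.com)".
ENTRY = re.compile(r"^(?P<ip>\d{1,3}(?:\.\d{1,3}){3}) \((?P<host>[^)]+)\)$")


def owner_domain(hostname):
    """Naive guess at the owner: the last two labels of the hostname."""
    return ".".join(hostname.lower().split(".")[-2:])


def count_by_domain(lines):
    """Count hits per owning domain, skipping lines that do not match."""
    counts = Counter()
    for line in lines:
        match = ENTRY.match(line.strip())
        if match:
            counts[owner_domain(match.group("host"))] += 1
    return counts


sample = [
    "207.68.146.40 (msnbot.msn.com)",
    "207.68.146.40 (lj2070.inktomisearch.com)",
    "66.249.65.147 (crawl-66-249-65-147.googlebot.com)",
]
for domain, hits in sorted(count_by_domain(sample).items()):
    print(domain, hits)
```

This only reads the hostname your log already recorded. To double-check that an IP address really belongs to the name it claims, you could look it up yourself, for example with Python's `socket.gethostbyaddr`.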

How can I get my site spidered?
As I mentioned before, if your website is up long enough, it "will" get spidered eventually. However, if you want to ensure that it gets done within a few months, go to the various search engine websites and look for the "Add URL" or "Suggest a Link" pages. DMOZ is one of the big directories to which you should submit your site. When you sign up with these search engines, your website is automatically queued up to be spidered. It may take several weeks or months for your site to actually start showing up in the search engine, even after you see the robot spidering your website.

What about pay search engines?
There are a bunch of different search engines that make you pay to have your website listed. I personally don't support these search engines; I find that most people use the big free search engines anyway. However, if you do wish to get included in some search engines faster, many have payment options that will get your site listed within a couple of days.

Ken Dennis
http://KenDennis-RSS.homeip.net/
