Search Engine Robots - How They Work, What They Do (Part I)


Automated search engine robots, sometimes called "spiders" or "crawlers", are the seekers of web pages. How do they work? What is it they really do? Why are they important?

You'd think with all the fuss about indexing web pages to add to search engine databases, that robots would be great and powerful beings. Wrong. Search engine robots have only basic functionality like that of early browsers in terms of what they can understand in a web page. Like early browsers, robots just can't do certain things. Robots don't understand frames, Flash movies, images or JavaScript. They can't enter password protected areas and they can't click all those buttons you have on your website. They can be stopped cold while indexing a dynamically generated URL and slowed to a stop with JavaScript navigation.

How Do Search Engine Robots Work?

Think of search engine robots as automated data retrieval programs, traveling the web to find information and links.

When you submit a web page to a search engine at the "Submit a URL" page, the new URL is added to the robot's queue of websites to visit on its next foray out onto the web. Even if you don't directly submit a page, many robots will find your site because of links from other sites that point back to yours. This is one of the reasons why it is important to build your link popularity and to get links from other topical sites back to yours.

When arriving at your website, the automated robots first check to see if you have a robots.txt file. This file is used to tell robots which areas of your site are off-limits to them. Typically these may be directories containing only binaries or other files the robot doesn't need to concern itself with.
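
As a rough illustration of that first check, here is a minimal sketch using Python's standard urllib.robotparser module. The robots.txt contents, the "ExampleBot" user agent and the example.com URLs are all made up for the example; a real robot would download the file from your site rather than parse a string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; the directory names are invented.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /downloads/binaries/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been fetched."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# "ExampleBot" is a placeholder user agent, not a real robot.
allowed_page = parser.can_fetch("ExampleBot", "http://www.example.com/index.html")
blocked_page = parser.can_fetch("ExampleBot", "http://www.example.com/cgi-bin/search.cgi")

print("index.html allowed:", allowed_page)
print("cgi-bin script allowed:", blocked_page)
```

A well-behaved robot would run a check like this before requesting each page and skip anything the file disallows.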

Robots collect links from each page they visit and later follow those links through to other pages. The entire World Wide Web is made up of links, the original idea being that you could follow links from one place to another. This is how robots get around.
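
The collect-then-follow loop can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the "web" here is a small made-up dictionary of pages on example.com instead of real HTTP requests, and the LinkCollector and crawl names are invented for the example.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical in-memory site standing in for pages fetched over HTTP.
PAGES = {
    "http://www.example.com/": '<a href="/about.html">About</a> <a href="/products.html">Products</a>',
    "http://www.example.com/about.html": '<a href="/">Home</a>',
    "http://www.example.com/products.html": '<a href="/about.html">About</a> <a href="/contact.html">Contact</a>',
    "http://www.example.com/contact.html": "",
}


def crawl(start_url, pages):
    """Visit pages breadth-first, queueing each newly discovered link once."""
    queue = deque([start_url])
    seen = {start_url}
    visited = []
    while queue:
        url = queue.popleft()
        html = pages.get(url)
        if html is None:
            continue  # page not available; a real robot might retry later
        visited.append(url)
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.links:
            absolute = urljoin(url, href)  # resolve relative links
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return visited


visit_order = crawl("http://www.example.com/", PAGES)
print(visit_order)
```

The queue plays the role of the robot's list of pages to visit, and the "seen" set keeps it from following the same link twice.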

The "smarts" about indexing pages online come from the search engine engineers, who devise the methods used to evaluate the information the search engine robots retrieve. Once introduced into the search engine database, the information is available to searchers querying the search engine. When a user enters a query, the search engine performs a number of quick calculations to present the set of results most relevant to that query.

You can see which pages on your site the search engine robots have visited by looking at your server logs or the results from your log statistics program. Identifying the robots will show you when they visited your website, which pages they visited and how often they visit. Some robots are readily identifiable by their user agent names, like Google's "Googlebot"; others are a bit more obscure, like Inktomi's "Slurp". Still other robots may be listed in your logs that you cannot readily identify; some of them may even appear to be human-powered browsers.
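
If you want to try this by hand, the sketch below scans log lines for the two user agent names mentioned above. It assumes your server writes the common "combined" log format, where the user agent is the last quoted field; the sample lines, IP addresses and the robot_visits helper are made up for illustration, so adjust the pattern to whatever your own logs look like.

```python
import re
from collections import Counter

# Substrings to look for in the user agent field (names from the article).
KNOWN_ROBOTS = ("Googlebot", "Slurp")

# Assumed "combined" log format; the user agent is the last quoted field.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'\d{3} \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

# Made-up sample lines; the addresses are reserved documentation IPs.
SAMPLE_LOG = """\
192.0.2.10 - - [10/Oct/2004:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"
192.0.2.11 - - [10/Oct/2004:14:02:11 -0700] "GET /about.html HTTP/1.0" 200 1042 "-" "Mozilla/5.0 (compatible; Slurp)"
192.0.2.10 - - [10/Oct/2004:14:10:02 -0700] "GET /products.html HTTP/1.0" 200 5120 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"
192.0.2.50 - - [10/Oct/2004:14:12:45 -0700] "GET /index.html HTTP/1.1" 200 2326 "http://www.example.com/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
"""


def robot_visits(lines):
    """Count visits per known robot and record which pages each one requested."""
    visits = Counter()
    pages = {}
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        agent = match.group("agent").lower()
        parts = match.group("request").split()
        path = parts[1] if len(parts) > 1 else ""
        for robot in KNOWN_ROBOTS:
            if robot.lower() in agent:
                visits[robot] += 1
                pages.setdefault(robot, []).append(path)
    return visits, pages


visits, pages = robot_visits(SAMPLE_LOG.splitlines())
for robot, count in visits.items():
    print(robot, count, pages[robot])
```

Matching on the user agent string alone is only a first pass, since any client can claim any name; the robot lists in the resources can help you cross-check IP addresses.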

Along with identifying individual robots and counting the number of their visits, the statistics can also show you aggressive bandwidth-grabbing robots or robots you may not want visiting your website. In the resources section at the end of this article, you will find sites that list names and IP addresses of search engine robots to help you identify them.

How Do They Read The Pages On Your Website?

When the search engine robot visits your page, it looks at the visible text on the page, the content of the various tags in your page's source code (title tag, meta tags, etc.), and the hyperlinks on your page. From the words and the links that the robot finds, the search engine decides what your page is about. Many factors are used to figure out what "matters", and each search engine has its own algorithm to evaluate and process the information. Depending on how the search engine has set up the robot, the information is indexed and then delivered to the search engine's database.
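
To get a feel for what a robot "sees", the following sketch pulls the same pieces out of a page: the title tag, the meta tags, the hyperlinks and the visible text. It uses Python's standard html.parser module; the PageReader class and the sample page are invented for the example, and real robots are certainly more elaborate than this.

```python
from html.parser import HTMLParser


class PageReader(HTMLParser):
    """Gathers the title, meta tags, links and visible text of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self.links = []
        self.text_parts = []
        self._in_title = False
        self._skip_depth = 0  # greater than zero while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name"):
            self.meta[attrs["name"].lower()] = attrs.get("content") or ""
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._in_title:
            self.title += text
        elif not self._skip_depth:
            self.text_parts.append(text)  # script and style content is ignored


# Made-up page used only to exercise the reader.
SAMPLE_PAGE = """\
<html><head>
<title>Example Widgets</title>
<meta name="description" content="Hand-made widgets for every occasion.">
<meta name="keywords" content="widgets, hand-made">
<script>document.write("menu");</script>
</head><body>
<h1>Welcome to Example Widgets</h1>
<p>We sell <a href="/widgets.html">widgets</a> of all sizes.</p>
</body></html>
"""

reader = PageReader()
reader.feed(SAMPLE_PAGE)
reader.close()
visible_text = " ".join(reader.text_parts)

print("Title:", reader.title)
print("Meta:", reader.meta)
print("Links:", reader.links)
print("Text:", visible_text)
```

Note that the text written by the script never shows up in the output, which mirrors the point made earlier about robots not understanding JavaScript.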

The information delivered to the databases then becomes part of the search engine and directory ranking process. When the search engine visitor submits their query, the search engine digs through its database to give the final listing that is displayed on the results page.
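
As a toy picture of that lookup, the sketch below stores words in a simple index and ranks pages by how often the query words appear. This is an assumption made purely for illustration: each search engine has its own algorithm, and none of them is as crude as a raw word count. The page texts and the build_index and search names are invented.

```python
from collections import Counter, defaultdict


def build_index(pages):
    """Map each word to a count of how often it appears on each page."""
    index = defaultdict(Counter)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word.strip(".,!?")][url] += 1
    return index


def search(index, query):
    """Return pages ordered by total occurrences of the query words."""
    scores = Counter()
    for word in query.lower().split():
        for url, count in index.get(word, {}).items():
            scores[url] += count
    # Highest score first; ties are broken alphabetically by URL.
    return [url for url, _ in sorted(scores.items(), key=lambda item: (-item[1], item[0]))]


# Made-up page text, as a robot might have delivered it to the database.
PAGES = {
    "/widgets.html": "Widgets for sale. Blue widgets and red widgets.",
    "/about.html": "About our widgets company.",
    "/contact.html": "Contact us by email.",
}

index = build_index(PAGES)
results = search(index, "widgets")
print(results)
```

The point is only that the heavy lifting happens before the query arrives: the index is built from what the robots brought back, and the query is answered from the index.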

The search engine databases update at varying times. Once you are in the search engine databases, the robots keep visiting you periodically, to pick up any changes to your pages, and to make sure they have the latest info. The number of times you are visited depends on how the search engine sets up its visits, which can vary per search engine.

Sometimes visiting robots are unable to access the website they are visiting. If your site is down, or you are experiencing huge amounts of traffic, the robot may not be able to access your site. When this happens, the website may not be re-indexed, depending on the frequency of the robot visits to your website. In most cases, robots that cannot access your pages will try again later, hoping that your site will be accessible then.
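
The "try again later" behavior might look something like the sketch below. It is a simplification under stated assumptions: putting a URL at the back of the queue stands in for waiting hours or days, the limit of three attempts is arbitrary, and fake_fetch with its made-up URLs simulates a site that is sometimes unreachable.

```python
from collections import deque


def crawl_with_retries(urls, fetch, max_attempts=3):
    """Visit each URL, requeueing failures until max_attempts is reached."""
    queue = deque((url, 1) for url in urls)
    indexed, gave_up = [], []
    while queue:
        url, attempt = queue.popleft()
        try:
            fetch(url)
        except ConnectionError:
            if attempt < max_attempts:
                queue.append((url, attempt + 1))  # back of the queue means "later"
            else:
                gave_up.append(url)  # page is not re-indexed on this round
        else:
            indexed.append(url)
    return indexed, gave_up


# Simulated outages: number of failed attempts remaining per made-up URL.
failures_left = {
    "http://www.example.com/busy.html": 2,
    "http://www.example.com/down.html": 99,
}


def fake_fetch(url):
    """Stand-in for an HTTP request; raises while the page is 'down'."""
    if failures_left.get(url, 0) > 0:
        failures_left[url] -= 1
        raise ConnectionError(f"cannot reach {url}")
    return "<html></html>"


indexed, gave_up = crawl_with_retries(
    [
        "http://www.example.com/index.html",
        "http://www.example.com/busy.html",
        "http://www.example.com/down.html",
    ],
    fake_fetch,
)
print("Indexed:", indexed)
print("Gave up on:", gave_up)
```

Here the busy page is picked up on the third attempt, while the page that stays down is skipped until some future visit.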

Resources

*SpiderSpotting: Search Engine Watch. http://searchenginewatch.com/webmasters/spiders.html

*Robotstxt.org: List of robots and protocols for setting up a robots.txt file. http://www.robotstxt.org/

*Spider-Food: Tutorials, forums and articles about Search Engine spiders and Search Engine Marketing. http://spider-food.net/

*Spiderhunter.com: Articles and resources about tracking Search Engine spiders. http://www.spiderhunter.com/

*Sim Spider Search Engine Robot Simulator: Search Engine World has a spider that simulates what the Search Engine robots read from your website. http://www.searchengineworld.com/cgi-bin/sim_spider.cgi

Daria Goetsch is the founder and Search Engine Marketing Consultant for Search Innovation Marketing, a Search Engine Optimization company serving small businesses. She has specialized in Search Engine Promotion since 1998, including three years as the Search Engine Specialist for O'Reilly Media, Inc., a technical book publishing company.

Copyright © 2002-2005 Search Innovation Marketing. http://www.searchinnovation.com All Rights Reserved.

Permission to reprint this article is granted if the article is reproduced in its entirety, without editing, including the bio information. Please include a hyperlink to http://www.searchinnovation.com when using this article in newsletters or online.
