Thursday, April 9, 2009

Search Engine Spider Behaviour:

Search engines use software to browse the web, looking for new sites to index. This software is referred to as a spider, a crawler, or a worm. Most search engine spiders have roughly the same level of technology as an early-version web browser.

What does that mean to us? It means that spiders can't run JavaScript functions, can't process images or frames, and generally can't spider image maps. Spiders can't see text that is contained in graphic images, although some of them read and use the alt attribute (the "Alt tag") assigned to the image. Think about how the navigation is set up on your site: do you have rollover images or drop-down lists in your navigation? If so, search engines may not see the interior pages behind them. Spiders require text links to get into the deeper content of your site.
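To make this concrete, here is a minimal sketch of what a simple, non-JavaScript spider might pick up from a page. It is not how any particular search engine works; the class name SpiderView, the helper spider_view, and the sample page are made up for illustration. It only uses Python's standard html.parser module, collects plain text links and image alt text, and skips anything inside script blocks.

```python
from html.parser import HTMLParser


class SpiderView(HTMLParser):
    """Collects what a simple crawler with no JavaScript support could read."""

    def __init__(self):
        super().__init__()
        self.links = []      # href values from plain <a> text links
        self.alt_texts = []  # alt attributes found on <img> tags
        self.text = []       # HTML text outside of <script>/<style>
        self._skip = 0       # nesting depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "img" and attrs.get("alt"):
            self.alt_texts.append(attrs["alt"])

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        # Script and style bodies are never treated as page content.
        if not self._skip and data.strip():
            self.text.append(data.strip())


def spider_view(html):
    """Parse an HTML string and return the populated SpiderView."""
    parser = SpiderView()
    parser.feed(html)
    parser.close()
    return parser


# Hypothetical page: one rollover image, one script-driven link, two text links.
SAMPLE_PAGE = """
<html><body>
  <img src="nav-products.gif" alt="Products" onmouseover="swap(this)">
  <script>function go() { window.location = "/hidden.html"; }</script>
  <p>Welcome to our store.</p>
  <a href="/products.html">Products</a>
  <a href="/contact.html">Contact</a>
</body></html>
"""

if __name__ == "__main__":
    view = spider_view(SAMPLE_PAGE)
    print("Links found:", view.links)
    print("Alt text found:", view.alt_texts)
    print("Text found:", view.text)
```

In this sketch the page reached only through the script (/hidden.html) never shows up in the list of links, while the two plain text links do. That is the reason for keeping ordinary text links somewhere on every page.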

Have you ever noticed how some designers always put text links at the bottom of their pages? One good reason to do that is so that your site complies with U.S. disability regulations, since text readers also can't run JavaScript functions. But the best reason is to get the spiders into the site. Section 508 compliance has become an important aspect of Web design, and it goes hand in hand with search engine optimization.

The text on the pages should be HTML text. Text that is contained in images can't be seen by the spiders, so the content won't be reported back to the search engine.

The goal is to have the search engine spider send back information on every page of your site.

Spiders also browse through the directory structure of your site. It might not be a good idea to give them access to certain areas, such as your cgi-bin directory. Certain meta tags give instructions to search engine spiders, but they are mostly ignored. The best way to keep spiders out of places you don't want them is to use a robots.txt file. The robots.txt file lists the files and directories that you want to hide from the search engines. This file is kept at the root level of the site, and it is the first thing that search engine spiders look for when they access your site. Take a look at your Web site statistics program. If you don't have a robots.txt file, look at the error section and see whether it is listed as a "file not found". If you do have one, it should be listed under your accessed files section. One of the best ways to keep your site search engine friendly is to provide an XML sitemap.
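As a rough illustration of how a robots.txt file works, the sketch below defines a small example file and checks it with Python's standard urllib.robotparser module, which applies the rules much as a well-behaved spider would. The directory names, the www.example.com URLs, the "ExampleBot" user agent, and the helper build_rules are all assumptions made for the example, not rules taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents: keep every spider out of two directories.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /private/
"""


def build_rules(robots_txt):
    """Load robots.txt text into a parser without fetching anything."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


if __name__ == "__main__":
    rules = build_rules(ROBOTS_TXT)
    for url in (
        "http://www.example.com/cgi-bin/form.pl",
        "http://www.example.com/private/notes.html",
        "http://www.example.com/products.html",
    ):
        allowed = rules.can_fetch("ExampleBot", url)
        print(url, "->", "allowed" if allowed else "blocked")
```

The real file would be saved as robots.txt at the root of the site. Keep in mind that it is only a request: polite spiders honor it, but it does not actually protect the files it lists.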