Why wouldn’t you want search engine spiders or robots to search every nook and cranny of your website? Surely the more content they crawl, the more information they have to push your website further up the search results?

Not when it comes to pages with calendar scripts. The issue here is that crawlers will generally try to follow every link on your pages unless you tell them not to. Consider a calendar script that allows people to choose a date for an upcoming holiday. Most calendars calculate where the dates and days of the week will fall, so users can, feasibly, click infinitely into the past or future. Now imagine this calendar being looked at by a search engine crawler. The crawler doesn’t think the way a user does. It will just follow the link to the next page of information, so in theory it can end up following your calendar into the infinite future.

Of course, the crawler will stop at some point. It will determine that it has fallen into an infinite loop and cease crawling your site. So what’s the problem? Won’t it just move on to the next part of the page? No. Infinite loops can cause crawlers to leave your site altogether, and they can also stop your important content from being indexed. If a crawler falls into an infinite loop before it reaches your main content, chances are that content doesn’t get indexed.
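
To make the trap concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration: the site layout, the `/calendar/YYYY-MM` URL pattern, the `links_on` and `crawl` helpers, and the idea that the crawler follows the first link it finds and gives up after a fixed number of pages. Real search engine crawlers schedule their visits in far more sophisticated ways, so treat this only as a picture of the problem.

```python
def links_on(url):
    """Return the links found on a page of a made-up site (hypothetical layout)."""
    if url == "/":
        # The calendar link comes before the links to the main content.
        return ["/calendar/2024-01", "/about", "/products"]
    if url.startswith("/calendar/"):
        # Every calendar page links to the following month, without end.
        year, month = map(int, url.rsplit("/", 1)[1].split("-"))
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
        return [f"/calendar/{year}-{month:02d}"]
    return []


def crawl(start, page_budget, blocked_prefixes=()):
    """Follow links depth-first until the (assumed) page budget runs out."""
    visited = []
    seen = {start}
    stack = [start]
    while stack and len(visited) < page_budget:
        url = stack.pop()
        visited.append(url)
        # Push in reverse so the first link on the page is followed first.
        for link in reversed(links_on(url)):
            if link in seen or link.startswith(tuple(blocked_prefixes)):
                continue
            seen.add(link)
            stack.append(link)
    return visited


trapped = crawl("/", page_budget=10)
clean = crawl("/", page_budget=10, blocked_prefixes=("/calendar/",))
print(trapped)  # the home page followed by nine calendar months
print(clean)    # the home page, then /about and /products
```

In the first run the whole budget is spent on calendar pages and `/about` and `/products` are never visited. In the second run, where the calendar links are skipped, the crawler reaches the main content straight away.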

What can you do to stop this? Use a robots.txt file.

In our next blog post we will explain why you need a robots.txt file, and how you can use it to stop the above from happening.
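
As a rough preview, a rule that keeps well-behaved crawlers out of a calendar might look like the sketch below. The `/calendar/` path and the `ExampleBot` crawler name are assumptions made up for this example, so substitute whatever your own site uses. The rules are checked here with `urllib.robotparser` from Python’s standard library, which is a handy way to test a robots.txt file before you publish it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ask every crawler to stay out of /calendar/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /calendar/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# "ExampleBot" is a made-up crawler name used only for this check.
calendar_allowed = parser.can_fetch("ExampleBot", "/calendar/2031-05")
about_allowed = parser.can_fetch("ExampleBot", "/about")

print(calendar_allowed)  # False: the calendar is off limits
print(about_allowed)     # True: the rest of the site can still be crawled
```

Keep in mind that robots.txt is a request rather than a lock: reputable search engine crawlers honour it, but nothing forces a badly behaved robot to do so.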
