One of the most frequently asked questions I see on webmaster forums is about Google Sitemaps. In short, a Google Sitemap is a mechanism that allows Google to index your site more accurately. If you do not implement a Google Sitemap, Google will still index your site. So why should you set one up?
Managing Google’s Bot
The main dilemma is that you want Google's bots to visit your site and index as much as possible, but after that you only need Google to index new content rather than waste bandwidth re-crawling old content. The robots.txt file offers only limited options; it can't tell bots to index only new content, so Google came up with Sitemaps to help webmasters make bots work more efficiently. Google's Sitemap format makes it easy to tell the search engine bot what your URLs are, when they were last changed, and how frequently they change. This way the bot knows what it needs to look at and what it can ignore, which should reduce crawling and make for more accurate indexing as well.
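To give a rough idea, a Sitemap is just an XML file that lists each URL along with its last-modified date and change frequency. The sketch below uses made-up URLs and dates, and the namespace shown is the one Google's protocol used at the time; check the Sitemap protocol documentation for the exact version to use:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2006-02-01</lastmod>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>http://www.example.com/archives/</loc>
    <lastmod>2006-01-15</lastmod>
    <changefreq>weekly</changefreq>
  </url>
</urlset>
```

The bot reads this file, compares the lastmod dates against what it already has, and can skip pages that haven't changed.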
For websites that use non-search-engine-friendly URLs, or brand-new sites that have published a lot of content, the benefit of using a Sitemap is more accurate indexing. If your site is new and has no PageRank, a Sitemap will not get your site added any quicker than if you did not use one. The idea is not immediate search engine rankings; the idea is correct indexing, so that when Google does officially add your site to its databases, all of your pages will be included correctly.
If you use WordPress, Google Sitemap Generator is a WordPress plugin that can create and manage your Google Sitemap. You can also create your own manually; just follow Google's Sitemap protocol.
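If you want to roll your own, a small script is all it takes. Here is a minimal sketch in Python; the page list is hypothetical example data, and the namespace reflects the protocol version of the time, so substitute whatever the current protocol specifies:

```python
# Minimal Google Sitemap generator -- a sketch, not an official tool.
# The pages list below is made-up example data.
pages = [
    # (URL, last-modified date YYYY-MM-DD, change frequency)
    ("http://www.example.com/", "2006-02-01", "daily"),
    ("http://www.example.com/about/", "2006-01-10", "monthly"),
]

def build_sitemap(pages):
    """Return the Sitemap XML for a list of (loc, lastmod, changefreq) tuples."""
    lines = ['<?xml version="1.0" encoding="UTF-8"?>',
             '<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">']
    for loc, lastmod, changefreq in pages:
        lines.append("  <url>")
        lines.append("    <loc>%s</loc>" % loc)
        lines.append("    <lastmod>%s</lastmod>" % lastmod)
        lines.append("    <changefreq>%s</changefreq>" % changefreq)
        lines.append("  </url>")
    lines.append("</urlset>")
    return "\n".join(lines)

if __name__ == "__main__":
    # Write the file to your web root so bots can fetch /sitemap.xml.
    with open("sitemap.xml", "w") as f:
        f.write(build_sitemap(pages))
```

Run it whenever you publish new content, or hook it into your publishing script so the Sitemap stays current automatically.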
Forget Web 2.0, the real talk on the Internet right now has to do with Google's massive datacenter changes, nicknamed "Big Daddy" by almost everyone. Instead of just a normal PageRank update that everyone can see in their Google Toolbar, Big Daddy is, according to most industry insiders, essentially an update of everything Google.
Google is testing a new data center infrastructure, a feat much bigger and more comprehensive than an algorithm change. Dubbed "Big Daddy" both in the search marketing blogs and forums and by the friendly folks at Google, this new data center, still in shakedown mode, will reportedly add new ground-level capabilities to the Google search function and drive those powers deep into all the algorithms with which Google searches, studies and indexes the Web.
The most apparent change most webmasters will notice as Big Daddy rolls out is Google's new spider. Looking over your logs in February, you might notice this:
Agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Google’s new bot is based on Mozilla code and so all that work you did on making your site more open standards friendly should pay off, especially if you check your site with Firefox or any other Mozilla based browser.
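If you want to see whether the new bot has been visiting, you can scan your access log for that user-agent string. A quick sketch in Python follows; the log lines are invented samples, and real log formats vary by server, so adjust the matching to your own log layout:

```python
# Count visits from the new Mozilla-based Googlebot in an access log.
# These sample lines are made up; a real combined log has more fields.
log_lines = [
    '1.2.3.4 - - [01/Feb/2006] "GET / HTTP/1.1" 200 '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '5.6.7.8 - - [01/Feb/2006] "GET /feed HTTP/1.1" 200 '
    '"Mozilla/4.0 (compatible; MSIE 6.0)"',
]

def count_new_googlebot(lines):
    """Count log lines showing the Mozilla/5.0-based Googlebot."""
    return sum(1 for line in lines
               if "Googlebot/2.1" in line and "Mozilla/5.0" in line)

print(count_new_googlebot(log_lines))  # prints 1 for the sample above
```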
For a while now, webmasters have been noticing the aggressiveness of Google's new search bot, and if you decided to discourage Googlebot 2.1, you should probably think about reversing that action quickly. The notion is that Google now uses this new bot first to spider sites. If you send Googlebot 2.1 away, Google will stop sending any other bots to your site, period!
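If you blocked the bot through robots.txt, undoing it is a one-line change. A sketch of the before and after (the blanket block shown here is just an example of what you might have had):

```
# Before: this told Googlebot to stay away from the whole site
# User-agent: Googlebot
# Disallow: /

# After: an empty Disallow lets Googlebot crawl everything
User-agent: Googlebot
Disallow:
```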
All this means that Google is stepping up, upgrading its data centers to meet the challenge of newer competitors like MSN and Yahoo.
Other than Google Analytics, which came out this week, Google also made some changes to Google Sitemaps. The new features essentially amount to feedback on the most popular search terms for your site, and on what errors, if any, some of your Sitemap URLs are generating.
To enable the new options, you must verify your site by creating a specifically named HTML file on your server. The new features are helpful, but may seem insignificant compared to Google Analytics.