Put Your Site on the Map with Google Sitemaps - Namespaces and Elements
The namespace (xmlns) is a unique identifier, written in the form of a URL, that states which schema your Sitemap file follows. The next element is the <url> element, which acts as a container for the information about a single page of your site. The child elements of the <url> element provide additional information about that page. The first child element of the <url> element is the <loc> or location element, which identifies the page by its URL:
<loc>http://www.yourdomain.com/</loc>
The data within the <loc> element must start with the protocol in use (HTTP in this case) and must end in a trailing slash if an individual page isn't specified. You can specify your root directory, subdirectories, or individual pages, so you could also use something like this:
<loc>http://www.yourdomain.com/todaysnews.htm</loc>
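The two rules for <loc> values (start with the protocol, end in a trailing slash when no page is named) can be checked before a URL goes into the file. The following is a minimal Python sketch; the function name normalize_loc is hypothetical, and accepting https as well as http is an assumption that goes beyond the HTTP-only examples in this article.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_loc(url: str) -> str:
    """Check that a URL starts with a protocol and add the trailing
    slash when no individual page is specified."""
    parts = urlsplit(url)
    # Assumption: http and https are the only protocols accepted here.
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"URL must start with http:// or https://: {url!r}")
    if parts.path == "":
        # Bare domain such as http://www.yourdomain.com gets its slash.
        parts = parts._replace(path="/")
    return urlunsplit(parts)


print(normalize_loc("http://www.yourdomain.com"))
print(normalize_loc("http://www.yourdomain.com/todaysnews.htm"))
```

A URL without a protocol, such as www.yourdomain.com, raises ValueError instead of being silently written to the Sitemap.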
Dynamically generated URLs can also be used, but any entity characters (& < > ' and ") must be escaped correctly. The value of the <loc> element must be less than 2048 characters long, which should be more than enough for most dynamic URLs. The following URL would be considered valid:
<loc>http://www.yourdomain.com/todaysfavourites?category=fun&amp;type=pics</loc>
As you can see, the & character has been escaped using &amp;. The other escape codes are &apos; for ', &quot; for ", &gt; for > and &lt; for <.
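If you generate your Sitemap from a script, the escaping does not have to be done by hand. As a rough sketch, Python's standard xml.sax.saxutils.escape handles &, < and > by default and accepts a dictionary of extra replacements for the two quote characters. The helper name escape_loc is hypothetical.

```python
from xml.sax.saxutils import escape

# escape() covers &, < and > on its own; the quotes are passed as extras.
EXTRA_ENTITIES = {"'": "&apos;", '"': "&quot;"}


def escape_loc(url: str) -> str:
    """Return the URL with all five entity characters escaped."""
    return escape(url, EXTRA_ENTITIES)


escaped = escape_loc(
    "http://www.yourdomain.com/todaysfavourites?category=fun&type=pics"
)
print(f"<loc>{escaped}</loc>")
```

The printed line contains category=fun&amp;type=pics, matching the hand-escaped form.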
The <loc> element is the only required element in any <url> element, but the optional elements, of which any or all can be used, are as follows:
<lastmod>2006-12-06</lastmod>
<changefreq>daily</changefreq>
<priority>0.9</priority>
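To show how the required and optional elements fit together inside one <url> container, here is a small Python sketch using the standard xml.etree.ElementTree module. The function build_url_entry is a hypothetical helper, not part of any Google tool. It always writes <loc> and adds the optional children only when a value is supplied.

```python
import xml.etree.ElementTree as ET


def build_url_entry(loc, lastmod=None, changefreq=None, priority=None):
    """Build one <url> element; only <loc> is required."""
    url = ET.Element("url")
    ET.SubElement(url, "loc").text = loc
    optional = (
        ("lastmod", lastmod),
        ("changefreq", changefreq),
        ("priority", priority),
    )
    for tag, value in optional:
        if value is not None:
            ET.SubElement(url, tag).text = str(value)
    return url


entry = build_url_entry("http://www.yourdomain.com/", "2006-12-06", "daily", 0.9)
xml_text = ET.tostring(entry, encoding="unicode")
print(xml_text)
```

ElementTree escapes entity characters in element text when it serializes, so a raw & in the loc argument is written out as &amp; without any extra work.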
The <lastmod> date must be in the W3C Datetime format and can include the time if desired. Valid date and time fragments for the Datetime format are:
Year - YYYY
Year and Month - YYYY-MM
Complete Date - YYYY-MM-DD
Complete Date, Hours and minutes - YYYY-MM-DDTHH:MMTZD
You can also include seconds and fractions of seconds if necessary. The date and time are separated by a literal T, and TZD is the time zone designator: either Z for UTC, or the difference from UTC (GMT) in hours and minutes, prefixed with a plus or minus sign. A full date and time could be:
2006-12-06T18:00+00:00
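A script that writes <lastmod> values can produce this format directly. The sketch below assumes Python 3.6 or later, where datetime.isoformat accepts timespec="minutes" and appends the offset as +hh:mm for timezone-aware values. The function name w3c_datetime is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def w3c_datetime(dt: datetime) -> str:
    """Format a timezone-aware datetime as YYYY-MM-DDThh:mmTZD."""
    if dt.tzinfo is None:
        # Without a time zone there is no TZD to write.
        raise ValueError("datetime must be timezone-aware")
    return dt.isoformat(timespec="minutes")


utc_stamp = w3c_datetime(datetime(2006, 12, 6, 18, 0, tzinfo=timezone.utc))
est_stamp = w3c_datetime(
    datetime(2006, 12, 6, 13, 0, tzinfo=timezone(timedelta(hours=-5)))
)
print(utc_stamp)  # 2006-12-06T18:00+00:00
print(est_stamp)  # 2006-12-06T13:00-05:00
```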
The <changefreq> element can take any of the following values: always, hourly, daily, weekly, monthly, yearly or never. This value is only a hint to Google's spiders. Setting the <changefreq> of every page to hourly doesn't mean that a spider will be sent every hour to crawl your site.
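Because only these seven values are accepted, a generator script can reject anything else before it reaches the file. A minimal Python sketch, with the hypothetical helper name check_changefreq:

```python
VALID_CHANGEFREQ = (
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
)


def check_changefreq(value: str) -> str:
    """Return the value unchanged if it is one of the seven allowed."""
    if value not in VALID_CHANGEFREQ:
        raise ValueError(
            f"changefreq must be one of {', '.join(VALID_CHANGEFREQ)}; "
            f"got {value!r}"
        )
    return value


print(check_changefreq("daily"))
```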
The default priority, if this element is not specified, is 0.5. It can be any value from 0.0 to 1.0. This element is really only necessary on very large websites that visiting crawlers may not have time to index in full. The <priority> element is relative only to URLs in your own domain, so marking all of your URLs with a priority of 1.0 means only that each page in your domain is of equal value, not that your URLs are more important than URLs in someone else's Google Sitemap with a priority of 0.6. Within your domain, pages with a higher priority are more likely to be crawled before pages with a lower priority.
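The range and default described above are easy to enforce in a generator script. The Python sketch below uses a hypothetical helper, format_priority, that falls back to 0.5 when no value is given and writes one decimal place. The one-decimal formatting is a choice made for this example, not a requirement stated in this article.

```python
DEFAULT_PRIORITY = 0.5


def format_priority(value=None) -> str:
    """Return the text for a <priority> element, defaulting to 0.5."""
    if value is None:
        value = DEFAULT_PRIORITY
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"priority must be between 0.0 and 1.0; got {value}")
    return f"{value:.1f}"


print(format_priority(0.9))  # 0.9
print(format_priority())     # 0.5
```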
More By Dan Wellman