Sitemap Generator Software Reviews 2007

(Listed in order of ranking)

  1. Xtreeme SiteXpert
  2. Sitemap XML Software
  3. A1 Sitemap Software
  4. XML-Sitemaps

Further Information

Sitemaps Protocol

 

The sitemaps protocol is a formal standard for preparing XML sitemaps. The standard is updated over time (current version 0.90). The major search engines have collaborated on it and use sitemaps as an adjunct to their normal web-indexing efforts.

The XML components of a sitemap allow webmasters to list all of the URLs that they want crawled, along with other metadata such as the last time each page was changed.

SAMPLE XML SITEMAP

The following example shows a Sitemap that contains just one URL and uses all optional tags. The optional tags are <lastmod>, <changefreq> and <priority>.

			<?xml version="1.0" encoding="UTF-8"?>
			<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
			
			   <url>
			      <loc>http://www.sitemapsoftwareeview.com/</loc>
			      <lastmod>2007-06-30</lastmod>
			      <changefreq>weekly</changefreq>
			      <priority>0.7</priority>
			
			   </url>
			</urlset> 
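
If you would rather generate a file like the sample above from a script than write it by hand, the following is a minimal sketch using only the Python standard library. The function name build_sitemap and the example.com address are assumptions made for illustration; they are not part of the protocol or of any reviewed product.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Serialize the sitemap namespace as the default (unprefixed) namespace.
ET.register_namespace("", SITEMAP_NS)


def build_sitemap(entries):
    """Return sitemap XML as bytes.

    `entries` is a list of dicts (a hypothetical input shape) with a
    required 'loc' key and optional 'lastmod', 'changefreq', 'priority'.
    """
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for entry in entries:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        # Tags are written in protocol order; optional ones only if supplied.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in entry:
                child = ET.SubElement(url, f"{{{SITEMAP_NS}}}{tag}")
                child.text = str(entry[tag])
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


if __name__ == "__main__":
    xml_bytes = build_sitemap([
        {
            "loc": "http://www.example.com/",
            "lastmod": "2007-06-30",
            "changefreq": "weekly",
            "priority": 0.7,
        }
    ])
    print(xml_bytes.decode("utf-8"))
```

The xml_declaration argument needs Python 3.8 or later. The output is not indented the way the sample is, but the content is equivalent.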
			

XML TAG DEFINITIONS:

Tag (required or optional) and description
<urlset> required

Encapsulates the file and references the current protocol standard.

<url> required

Parent tag for each URL entry. The remaining tags are children of this tag.

<loc> required

URL of the page. This URL must begin with the protocol (such as http) and end with a trailing slash, if your web server requires it. This value must be less than 2,048 characters.
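
The two <loc> rules above (starts with a protocol, shorter than 2,048 characters) can be checked before a URL is written to the sitemap. This is only a sketch: the function name is_valid_loc is made up for the example, and limiting the protocol to http or https is an assumption, since the text above only says "such as http".

```python
from urllib.parse import urlsplit

MAX_LOC_LENGTH = 2048  # the value must be shorter than this


def is_valid_loc(loc):
    """Return True if `loc` starts with a protocol and is short enough."""
    parts = urlsplit(loc)
    # Assumption: only http and https are accepted as protocols here.
    has_protocol = parts.scheme in ("http", "https") and bool(parts.netloc)
    return has_protocol and len(loc) < MAX_LOC_LENGTH


if __name__ == "__main__":
    print(is_valid_loc("http://www.example.com/"))  # protocol present
    print(is_valid_loc("www.example.com/"))         # protocol missing
```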

<lastmod> optional

The date of last modification of the file. This date should be in W3C Datetime format. This format allows you to omit the time portion, if desired, and use YYYY-MM-DD.

Note that this tag is separate from the If-Modified-Since (304) header the server can return, and search engines may use the information from both sources differently.
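
As an illustration of the date-only form of <lastmod>, here is a small sketch that formats a date as YYYY-MM-DD and rejects values that are not real calendar dates. The function names are hypothetical, and the check deliberately covers only the date-only form, not the full W3C Datetime format with a time portion.

```python
import re
from datetime import date, datetime

_DATE_ONLY = re.compile(r"\d{4}-\d{2}-\d{2}")


def format_lastmod(day):
    """Format a datetime.date as YYYY-MM-DD."""
    return day.isoformat()


def is_valid_lastmod_date(text):
    """Return True if `text` is a real calendar date written as YYYY-MM-DD."""
    if not _DATE_ONLY.fullmatch(text):
        return False
    try:
        datetime.strptime(text, "%Y-%m-%d")
    except ValueError:
        # Covers impossible dates such as month 30.
        return False
    return True


if __name__ == "__main__":
    print(format_lastmod(date(2007, 6, 30)))
    print(is_valid_lastmod_date("2007-30-06"))  # day and month swapped
```

A common mistake this catches is writing the day before the month, which produces a month number that does not exist.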

<changefreq> optional

How frequently the page is likely to change. This value provides general information to search engines and may not correlate exactly to how often they crawl the page. Valid values are:

  • always
  • hourly
  • daily
  • weekly
  • monthly
  • yearly
  • never

The value "always" should be used to describe documents that change each time they are accessed. The value "never" should be used to describe archived URLs.

Please note that the value of this tag is considered a hint and not a command. Even though search engine crawlers may consider this information when making decisions, they may crawl pages marked "hourly" less frequently than that, and they may crawl pages marked "yearly" more frequently than that. Crawlers may periodically crawl pages marked "never" so that they can handle unexpected changes to those pages.
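
If you build sitemaps from a script, it can help to reject a misspelled <changefreq> value before it reaches the file. A minimal sketch, with a hypothetical function name, using the seven values listed above:

```python
# The seven values the protocol accepts, exactly as listed above.
VALID_CHANGEFREQ = (
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
)


def is_valid_changefreq(value):
    """Return True if `value` is one of the listed changefreq values."""
    return value in VALID_CHANGEFREQ


if __name__ == "__main__":
    print(is_valid_changefreq("weekly"))
    print(is_valid_changefreq("fortnightly"))
```

The comparison is case-sensitive, which is an assumption on our part: the list above gives the values in lowercase only.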

<priority> optional

The priority of this URL relative to other URLs on your site. Valid values range from 0.0 to 1.0. This value does not affect how your pages are compared to pages on other sites—it only lets the search engines know which pages you deem most important for the crawlers.

The default priority of a page is 0.5.

Please note that the priority you assign to a page is not likely to influence the position of your URLs in a search engine's result pages. Search engines may use this information when selecting between URLs on the same site, so you can use this tag to increase the likelihood that your most important pages are present in a search index.

Also, please note that assigning a high priority to all of the URLs on your site is not likely to help you. Since the priority is relative, it is only used to select between URLs on your site.
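
The priority rules above (a range of 0.0 to 1.0, with 0.5 assumed when the tag is left out) can be sketched as follows. The function name effective_priority is an assumption for the example, and this only shows how a script might apply the rules, not how any search engine treats the value.

```python
DEFAULT_PRIORITY = 0.5  # used when a page has no <priority> tag


def effective_priority(priority=None):
    """Return the priority as a float, applying the default and range check."""
    value = DEFAULT_PRIORITY if priority is None else float(priority)
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"priority must be between 0.0 and 1.0, got {value}")
    return value


if __name__ == "__main__":
    print(effective_priority())     # no tag supplied, default applies
    print(effective_priority(0.7))
```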

SITEMAP FILE LOCATION :

It's very important that you understand how the location of a sitemap limits the URLs that can be included in that sitemap. Any file or folder underneath the folder in which the sitemap.xml file is located can be included amongst the sitemap URLs. Any URL that sits above the folder containing the sitemap.xml file will be dropped from the indexing process.

For this reason it is usual to put your sitemap file at the root of your server. If you aren't sure which folder is the root of your server, please check with your web host.
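
To make the location rule concrete, here is a sketch that tests whether a page URL falls underneath the folder holding the sitemap. The function name in_sitemap_scope and the example.com addresses are made up for illustration. The check that both URLs share the same protocol and host is an added assumption; the text above only discusses folders.

```python
from urllib.parse import urlsplit


def in_sitemap_scope(sitemap_url, page_url):
    """Return True if `page_url` lies in or below the sitemap's folder."""
    sitemap = urlsplit(sitemap_url)
    page = urlsplit(page_url)
    # Assumption: the page must be on the same protocol and host.
    if (sitemap.scheme, sitemap.netloc) != (page.scheme, page.netloc):
        return False
    # Folder that contains the sitemap file, always ending in "/".
    folder = sitemap.path.rsplit("/", 1)[0] + "/"
    # A URL with no path at all refers to the root folder.
    return (page.path or "/").startswith(folder)


if __name__ == "__main__":
    sitemap = "http://www.example.com/catalog/sitemap.xml"
    print(in_sitemap_scope(sitemap, "http://www.example.com/catalog/item.html"))
    print(in_sitemap_scope(sitemap, "http://www.example.com/images/logo.png"))
```

With the sitemap at the root of the server, the folder becomes "/" and every page on that host passes the check, which is why the root is the usual choice.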

XML SITEMAP VALIDATION :

XML is a standard, and the XML Sitemap protocol has a standard schema (template), so a sitemap can be validated just like HTML or JavaScript. An online XML validator is available at the W3C website.

Please keep in mind that all of the Sitemap Software that we have reviewed outputs protocol-compliant XML Sitemaps.
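
For a quick local check before using an online validator, the sketch below confirms that a file is well-formed XML, that its root is <urlset> in the sitemap namespace, and that every <url> has a <loc>. This is not full schema validation, which needs a schema-aware tool; the function name check_sitemap is an assumption for the example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def check_sitemap(xml_data):
    """Return a list of problems; an empty list means the basic checks passed."""
    try:
        root = ET.fromstring(xml_data)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    problems = []
    if root.tag != f"{{{SITEMAP_NS}}}urlset":
        problems.append("root element is not <urlset> in the sitemap namespace")
    for index, url in enumerate(root.findall(f"{{{SITEMAP_NS}}}url"), start=1):
        loc = url.find(f"{{{SITEMAP_NS}}}loc")
        if loc is None or not (loc.text or "").strip():
            problems.append(f"<url> entry {index} has no <loc>")
    return problems


if __name__ == "__main__":
    sample = (
        b'<?xml version="1.0" encoding="UTF-8"?>'
        b'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        b"<url><loc>http://www.example.com/</loc></url>"
        b"</urlset>"
    )
    print(check_sitemap(sample))
```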

EXCLUDING CONTENT :

The Sitemaps protocol doesn't have a facility that will allow you to exclude content from the search engine robot. To correctly indicate content you don't want indexed, you will need to specify the file or folder in the robots.txt file.

More information on the robots.txt file is available at robotstxt.org.
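
To see how robots.txt exclusions behave, you can test a set of rules with the robots.txt parser in the Python standard library. The rules, folder names and example.com addresses below are invented for illustration, and is_crawl_allowed is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Example rules: keep all robots out of one folder and one file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /drafts/notes.html
"""


def is_crawl_allowed(robots_txt, url, user_agent="*"):
    """Return True if the given rules let `user_agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_crawl_allowed(ROBOTS_TXT, "http://www.example.com/index.html"))
    print(is_crawl_allowed(ROBOTS_TXT, "http://www.example.com/private/report.html"))
```

Individual search engine robots may interpret robots.txt slightly differently from this parser, so treat the result as a sanity check rather than a guarantee.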