signalkraft

Sitemaps for CodeIgniter

Google, Yahoo, Bing & Others

Version 0.7

Licensed under GNU GPL v2

This library for the popular PHP framework CodeIgniter generates XML sitemaps and notifies search engines and other web services of new content for them to crawl.

With live search around the corner and a relatively new, yet widely adopted and open standard for Sitemaps, now is the time to create a Sitemap for your CodeIgniter application!

Important Features

  • Easy to use: See the example to get started,
  • Tiny footprint: Only library and config are necessary,
  • Scalability: Split work load and update only when necessary (see a word about performance),
  • Open standard: Follows the Sitemaps XML protocol 0.9, compatible with any modern search engine,
  • Google, Yahoo, Bing & Ask.com: By default this library uses Sitemap Writer to spread your sitemap (this can be changed in the config),
  • Sitemap Indexes: Build an index of all your sitemaps (see Sitemap Index),
  • Autodiscovery: Web services will find your sitemap through robots.txt (see Autodiscovery).

This library is based on the work of Svetoslav Marinov but performs faster (objects were dropped in favor of arrays), supports a newer, non-proprietary standard and, in my opinion, fits the CodeIgniter philosophy much better (library versus plugin).

Installation

Download the library and extract the contents to your system/application folder. Modify the config file to suit your application, then load and use the library as you would any other. See the example below for a sample controller.

Example

In this example we will assume you have a blog you wish to create a sitemap for. Your blog has a model “posts_model” that contains all the articles you have written.

application/controllers/sitemap.php
<?php

class Sitemap extends Controller
{
	function Sitemap()
	{
		parent::Controller();
	}
	
	function index()
	{
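		// site_url() and redirect() come from the URL helper;
		// skip this line if the helper is already autoloaded
		$this->load->helper('url');
		$this->load->model('posts_model');
		$this->load->library('sitemaps');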
		
		$posts = $this->posts_model->get_posts();
		
		foreach($posts AS $post)
		{
			$item = array(
				"loc" => site_url("blog/" . $post->slug),
				// ISO 8601 format - date("c") requires PHP5
				"lastmod" => date("c", strtotime($post->last_modified)),
				"changefreq" => "hourly",
				"priority" => "0.8"
			);
			
			$this->sitemaps->add_item($item);
		}
		
		// file name may change due to compression
		$file_name = $this->sitemaps->build("sitemap_blog.xml");

		$responses = $this->sitemaps->ping(site_url($file_name));
		
		// Debug by printing out the requests and status code responses
		// print_r($responses);

		redirect(site_url($file_name));
	}
}

Call the controller and you should be redirected to your freshly built sitemap.

If you want to see HTTP response codes and the requests that were sent by the ping function, uncomment the print_r line in the example.
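
To make the item array from the example more tangible, here is a rough sketch of how one item maps to a <url> element in the Sitemaps XML protocol. render_url_entry() is a hypothetical helper written for illustration only; it is not part of this library and the library's own output may differ in whitespace and detail.

```php
<?php
// Hypothetical helper, NOT part of the library: renders one sitemap item
// as a <url> element following the Sitemaps XML protocol 0.9.
function render_url_entry(array $item)
{
    $xml = "<url>\n";
    // Only these four child tags are defined by the protocol; loc is required.
    foreach (array('loc', 'lastmod', 'changefreq', 'priority') as $tag) {
        if (isset($item[$tag])) {
            // Entity-escape &, <, >, ' and " so URLs with query strings stay valid XML
            $value = htmlspecialchars($item[$tag], ENT_QUOTES, 'UTF-8');
            $xml .= "  <{$tag}>{$value}</{$tag}>\n";
        }
    }
    return $xml . "</url>\n";
}

echo render_url_entry(array(
    'loc'        => 'http://www.example.com/blog/hello-world?lang=en&page=2',
    'lastmod'    => '2010-01-15T12:00:00+00:00',
    'changefreq' => 'hourly',
    'priority'   => '0.8',
));
```

Note the escaping step: an ampersand in a URL has to appear as &amp;amp; in the XML file, otherwise the sitemap will not parse.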

Of course you will want to update your sitemap frequently. On *nix operating systems a cron job is the obvious choice. To rebuild every ten minutes, run crontab -e and add this line:

crontab -e
# m h  dom mon dow   command
*/10  *  *  *  *  /usr/bin/wget -q -O /dev/null http://www.example.com/sitemap

On large sites with frequent changes, consider regenerating a sitemap only when content is created, updated or deleted. Arranging your data into several sitemaps can reduce the work load further: for example, a sitemap of all your blog posts can be updated separately from a sitemap of your infrequently modified static pages.
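
One way to update only when necessary is to compare the sitemap file's modification time with the time of your latest content change. The sketch below assumes you know where the built file ends up on disk and can get the latest change as a Unix timestamp; sitemap_is_stale() is a hypothetical helper, not a function of this library.

```php
<?php
// Hypothetical helper, NOT part of the library: returns true when the
// sitemap file is missing or older than the most recent content change.
// $last_content_change is a Unix timestamp, e.g. from strtotime().
function sitemap_is_stale($file_path, $last_content_change)
{
    // PHP caches stat results per request; clear them to read a fresh mtime
    clearstatcache();

    if (!file_exists($file_path)) {
        return true;
    }

    return filemtime($file_path) < $last_content_change;
}
```

In the example controller you could wrap the add_item() loop and the build() call in such a check, so that a cron job hitting the controller every ten minutes does no work when nothing has changed.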

To stitch all those sitemaps together into something the search engines can handle you will need another type of file, the sitemap index:

Sitemap Index

Sitemap indexes are technically only needed if your sitemap exceeds 50,000 URLs or 10 MB uncompressed file size, whichever comes first. In that case you need to build several smaller sitemaps and list their locations in a separate file, the sitemap index. This file is then treated like a normal sitemap:

$sitemaps = array(
	array("loc" => site_url("sitemap_posts.xml.gz"), "lastmod" => date("c")),
	array("loc" => site_url("sitemap_pages.xml.gz"))
);

$index_file_name = $this->sitemaps->build_index($sitemaps, "sitemap_index.xml");
$responses = $this->sitemaps->ping(site_url($index_file_name));

redirect(site_url($index_file_name));
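
If a single list of URLs outgrows the 50,000 limit, splitting it up can be as simple as the sketch below. split_into_sitemaps() and the numbered file names are assumptions made for illustration, not part of this library, and the sketch only looks at the URL count, not at the 10 MB size limit.

```php
<?php
// Hypothetical helper, NOT part of the library: splits a flat list of items
// into groups of at most $limit entries and gives each group a numbered
// file name such as sitemap_posts_1.xml, sitemap_posts_2.xml, ...
function split_into_sitemaps(array $items, $base_name = 'sitemap_posts', $limit = 50000)
{
    $files = array();

    // array_chunk() returns consecutive slices of at most $limit items
    foreach (array_chunk($items, $limit) as $i => $chunk) {
        $files[$base_name . '_' . ($i + 1) . '.xml'] = $chunk;
    }

    return $files;
}
```

Each group could then be added and built as its own sitemap and the resulting file names listed in the index. Whether the library resets its item list between two build() calls is something to check in its source before relying on this.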

For the sake of completeness I should mention that your sitemap index mustn't exceed 50,000 sitemaps or 10 MB uncompressed file size. If you manage to hit this limit of 50,000 * 50,000 = 2.5 * 10^9 URLs on CodeIgniter you should probably pause for a minute and contemplate just what got you into this position and how you may avoid it in the future.

Autodiscovery

Autodiscovery is a neat feature by which other web services can find your sitemap. You just need to create a robots.txt file in the root directory of your website (http://www.example.com/robots.txt) containing the following line:

robots.txt
Sitemap: http://www.example.com/sitemap.xml

If you already have a robots.txt, put this directive anywhere; it is independent of user-agent.
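
If you would rather not edit robots.txt by hand, the directive can be appended from PHP. The following is a minimal sketch that works on the file contents as a string; add_sitemap_directive() is a hypothetical helper and not part of this library.

```php
<?php
// Hypothetical helper, NOT part of the library: returns the robots.txt
// contents with a Sitemap directive for $sitemap_url appended, unless
// exactly that directive is already present on a line of its own.
function add_sitemap_directive($robots_txt, $sitemap_url)
{
    $directive = 'Sitemap: ' . $sitemap_url;

    // Accept Unix, old Mac and Windows line endings
    foreach (preg_split('/\r\n|\r|\n/', $robots_txt) as $line) {
        if (trim($line) === $directive) {
            return $robots_txt; // already there, nothing to do
        }
    }

    $body = rtrim($robots_txt);

    return ($body === '' ? '' : $body . "\n") . $directive . "\n";
}
```

You would read the existing file with file_get_contents(), pass it through this function and write the result back with file_put_contents(), provided the web server is allowed to write to that file.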

For further information visit the Sitemaps XML Protocol website or look at the source code of this library. All functions and parameters are documented.

Changelog

  • Version 0.7:
    • Fixed a bug in add_item_array()
    • Improved documentation
  • Version 0.6:
    • Added sitemap_index function
  • Version 0.5:
    • First public release