ProcessWire Sitemap XML

Turn-key ProcessWire module for easily configuring and rendering sitemap.xml output.

  • Support for up to 50,000 page URLs.
  • Creates a turn-key, ready-to-use yourdomain.com/sitemap.xml just by installing the module.
  • Custom configurable changefreq and priority on a per-template basis.
  • Option for automatically generated changefreq (change frequency) based on page modification times.
  • Add pages to the site map (or remove them) on a per-page basis from the module configuration.
  • Built-in cache option, ensuring you don't have to re-render the sitemap on every pageview.
  • Ability to add custom URL segments on a per-template or per-page basis (requires using the API).
  • Supports multi-language sitemaps (hreflang).


About WireSitemapXML

Upon installation, this module creates a template named sitemap-xml and a ready-to-use page at the URL /sitemap.xml. You can configure all the details in the module settings. This module also has a full API.

By default, hidden pages are excluded from the sitemap. In the module configuration, you can choose to add them back to the sitemap, or just their children, if you prefer. You can also choose to exclude specific pages as needed, even if not hidden.

On a multi-language site that uses ProcessWire's core LanguageSupportPageNames module, you can optionally configure your sitemap to include the URLs in each language. This is handled in the module configuration by specifying the appropriate hreflang language code for each of your languages. Note, however, that if you already use <link rel="alternate" hreflang='…' /> tags in your HTML page output, duplicating them in the sitemap is redundant.
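To make the hreflang idea concrete, here is a minimal, standalone sketch of what a multi-language entry in a sitemap generally looks like, following the commonly documented xhtml:link convention. The function renderUrlEntry() and the example URLs are hypothetical and are not part of WireSitemapXML; the module's actual output may differ in detail.

```php
<?php
/**
 * Render one <url> entry with hreflang alternates.
 * Hypothetical helper for illustration only; not part of WireSitemapXML.
 * The enclosing <urlset> is assumed to declare
 * xmlns:xhtml="http://www.w3.org/1999/xhtml".
 *
 * @param array $urlsByLang Map of hreflang code => absolute URL
 * @param string $defaultLang Key in $urlsByLang to use for <loc>
 * @return string
 */
function renderUrlEntry(array $urlsByLang, string $defaultLang): string {
  // Escape values so they are safe inside XML text and attributes
  $esc = function(string $s): string {
    return htmlspecialchars($s, ENT_XML1 | ENT_QUOTES, 'UTF-8');
  };
  $out = "<url>\n";
  $out .= "  <loc>" . $esc($urlsByLang[$defaultLang]) . "</loc>\n";
  // One alternate link per language, including the default language itself
  foreach($urlsByLang as $code => $url) {
    $out .= '  <xhtml:link rel="alternate" hreflang="' . $esc($code) .
      '" href="' . $esc($url) . '" />' . "\n";
  }
  return $out . "</url>\n";
}

$entry = renderUrlEntry([
  'en' => 'https://yourdomain.com/about/',
  'de' => 'https://yourdomain.com/de/ueber-uns/',
], 'en');

echo $entry;
```

Each language version of a page lists all of its alternates, which is why this information is redundant when the same links already appear in the HTML output.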

Installation

  • Copy all of the files for this module into a new directory named /site/modules/WireSitemapXML/.
  • In your ProcessWire admin, go to: Modules > Refresh.
  • Locate the "Site map XML" module on the "Site" tab of your Modules and click "Install".
  • Configure the module as indicated, or plan to come back to it after testing the default sitemap.xml output.

Please note that if your /site/templates/ directory is not writable, this module will ask you to copy the file /site/modules/WireSitemapXML/sitemap-xml.php to /site/templates/sitemap-xml.php.

After installation, preview the sitemap.xml at yourdomain.com/sitemap.xml (or yourdomain.com/sitemap.xml?debug=1 to view in plain text) and review it in detail. Make note of any URLs that you want to exclude, as well as any that are missing. Use the settings in the module configuration to customize your sitemap output until you have it how you want it.

Sitemap debug mode

In some cases it may be easier to preview your sitemap.xml if it is output with the text/plain content-type rather than as XML. To do this, append ?debug=1 to the URL, e.g. yourdomain.com/sitemap.xml?debug=1

Please note that for the Sitemap debug mode to work, either your site must be in debug mode ($config->debug = true; in /site/config.php) or you must be logged in as a superuser.

Including page URLs with URL segments

If you have page templates that support defined URL segments, then you'll likely want to include them in your sitemap too. If your URL segments are defined with your template already, then WireSitemapXML will already render them, without you having to do anything. But if your URL segments are not defined with the template, or if they happen to be dynamic or regex-based URL segments, then you'll need to handle them. See below for an example of how to do this by editing your /site/templates/sitemap-xml.php file.

$sitemap->setUrlSegmentsByTemplate('basic-page', [ 'foo', 'bar', 'hello/world' ]); 

The above example adds URL segments "foo", "bar" and "hello/world" to all pages using the "basic-page" template. When the sitemap is rendered, every page using the basic-page template will include 4 variations in the sitemap.xml output:

  • /path/to/page/
  • /path/to/page/foo/
  • /path/to/page/bar/
  • /path/to/page/hello/world/

If your URL segments vary per-page (rather than being all the same for pages using a template) then you'll want to instead add a hook in your /site/templates/sitemap-xml.php file, somewhere before the $sitemap->execute() line:

$sitemap->addHookAfter('getUrlSegments', function(HookEvent $event) {
  $page = $event->arguments(0); /** @var Page $page */
  if($page->path() === '/about-us/our-team/') {
    $event->return = [ 'designers', 'developers', 'authors' ];
  }
});

In the hook above, we are adding 3 URL segments to the page /about-us/our-team/, resulting in these 3 URLs being added to the sitemap:

  • /about-us/our-team/designers/
  • /about-us/our-team/developers/
  • /about-us/our-team/authors/

Known limitations and roadmap

Currently the output is limited to 50,000 URLs. This is a hard limit of the sitemap.xml format, not of this module. (This module will happily render as many URLs as memory will allow.) The solution is to split the sitemap into multiple sitemap.xml files with a sitemap index file that connects them all. This is exactly what we want to implement in one of the next versions of this module.
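For readers unfamiliar with sitemap index files, the following standalone sketch shows the general shape of one. It is not how this module works today, and the helper buildSitemapIndex() and the sitemap-N.xml file names are hypothetical; only the sitemapindex format itself comes from the sitemaps.org protocol.

```php
<?php
/**
 * Build a sitemap index document pointing at the child sitemap files
 * needed to hold $totalUrls URLs at $perFile URLs per file.
 * Hypothetical helper for illustration only; not part of WireSitemapXML.
 */
function buildSitemapIndex(int $totalUrls, string $baseUrl, int $perFile = 50000): string {
  // Always emit at least one child sitemap, even for an empty site
  $numFiles = max(1, (int) ceil($totalUrls / $perFile));
  $xml = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
  $xml .= '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
  for($n = 1; $n <= $numFiles; $n++) {
    // The file naming scheme (sitemap-1.xml, sitemap-2.xml, ...) is an assumption
    $loc = htmlspecialchars("$baseUrl/sitemap-$n.xml", ENT_XML1 | ENT_QUOTES, 'UTF-8');
    $xml .= "  <sitemap><loc>$loc</loc></sitemap>\n";
  }
  $xml .= '</sitemapindex>' . "\n";
  return $xml;
}

// 120,000 URLs at 50,000 per file requires 3 child sitemaps
$index = buildSitemapIndex(120000, 'https://yourdomain.com');

echo $index;
```

Search engines would then be pointed at the index file, and each child file stays within the 50,000 URL limit.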

Another thing being considered is basing the page loading around ProcessWire’s “raw” pages loading functions, rather than loading Page objects. This would enable the rendering to be much faster. However, it would limit the usefulness of hooks or other cases where you might want the full Page object.

Another limitation to consider is that it can be slow to render sitemap.xml output, especially if it contains many thousands of URLs. For this reason, be sure to use the cache feature and set an appropriate cache time of perhaps 1 day (86400 seconds) or at least 1 hour (3600 seconds).

Hookable methods and API reference

bool allowPage(Page $page)
Allow given $page to be rendered in sitemap?

bool allowParent(Page $page)
Allow children to be rendered in sitemap for given page?

bool allowLanguage(Page $page, Language $language)
Allow given page to be rendered in sitemap for given language?

bool allowUrlSegment(Page $page, $urlSegmentStr, $language)
Allow given URL segment string to be rendered in sitemap for given page?

array getUrlSegments(Page $page)
Get array of URL segments to render in sitemap for given page.
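As a rough sketch of how one of the "allow" methods above might be hooked, the following follows the same pattern as the getUrlSegments hook shown earlier and would go in /site/templates/sitemap-xml.php before the $sitemap->execute() line. The template name "press-release" is hypothetical, and this assumes that returning false from allowPage excludes the page, as the method description suggests.

```php
$sitemap->addHookAfter('allowPage', function(HookEvent $event) {
  $page = $event->arguments(0); /** @var Page $page */
  // 'press-release' is a hypothetical template name used for illustration
  if($page->template->name === 'press-release') {
    // Assumption: a false return value keeps the page out of the sitemap
    $event->return = false;
  }
});
```

Because this is an "after" hook, pages using other templates keep whatever value the module already decided on.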

  • Sitemap XML API reference

    Describes the public API methods and hookable methods for the WireSitemapXML module. Also includes helpful code examples.
