John W.

How To: Simple Sitemap.xml Generator That Optionally Excludes Children

Recommended Posts


A little guide to generating an sitemap.xml using (I believe) a script Ryan originally wrote with the addition of being able to optionally exclude child pages from being output in the sitemap.xml file.

I was looking back on a small project today where I was using a php script to generate an xml file, I believe the original was written by Ryan. Anyway, I needed a quick fix for the script to allow me to optionally exclude children of pages from being included in the sitemap.xml output.


A good example of this is a site where if you visit /minutes/ a page displays a list of board meetings which includes a title,  date, description and link to download the .pdf file.

I have a template called minutes and a template called minutes-document. The first page, minutes, when loaded via /minutes/ simply grabs all of its child pages and outputs the name, description and actual path of an uploaded .pdf file for a visitor to download.

In my back-end I have the template MINUTES and MINUTES-DOCUMENT. Thus:


So, basically, their employee can login, hover over minutes, click new, then create a new (child) record and name it the date of the meeting e.g. June 3rd, 2016 :




Outputting the sitemap.xml and optionally excluding children that belong to a template.

The setup of the original script is as follows:

1. Save the file to the templates folder as sitemap.xml.php

2. Create a template called sitemap-xml and use the sitemap.xml.php file.

3. Create a page called sitemap.xml using the sitemap-xml template


Now, with that done you will need to make only a couple of slight modifications that will allow the script to exclude children of a template from output to the sitemap.xml

1. Create a new checkbox field and name it:   sitemap_exclude_children

2. Add the field to a template that you want to control whether the children are included/excluded from the sitemap. In my example I added it to my "minutes" template.

3. Next, go to a page that uses a template with the field you added above. In my case, "MINUTES"

4. Enable the checkbox to exclude children, leave it unchecked to include children.

For example, in my MINUTES page I enabled the checkbox and now when /sitemap.xml is loaded the children for the MINUTES do not appear in the file.



A SIMPLE CONDITIONAL TO CHECK THE "sitemap_exclude_children" VALUE

This was a pretty easy modification to an existing script, adding only one line. I just figure there may be others out there using this script with the same needs.

I simply inserted the if condition as the first line in the function:

function renderSitemapChildren(Page $page) { 
	if($page->sitemap_exclude_children) return "";





 * ProcessWire Template to power a sitemap.xml 
 * 1. Copy this file to /site/templates/sitemap-xml.php
 * 2. Add the new template from the admin.
 *    Under the "URLs" section, set it to NOT use trailing slashes.
 * 3. Create a new page at the root level, use your sitemap-xml template
 *    and name the page "sitemap.xml".
 * Note: hidden pages (and their children) are excluded from the sitemap.
 * If you have hidden pages that you want to be included, you can do so 
 * by specifying the ID or path to them in an array sent to the
 * renderSiteMapXML() method at the bottom of this file. For instance:
 * echo renderSiteMapXML(array('/hidden/page/', '/another/hidden/page/')); 
 * patch to prevent pages from including children in the sitemap when a field is checked /
 * 1. create a checkbox field  named sitemap_exclude_children
 * 2. add the field to the parent template(s) you plan to use
 * 3. when a new page is create with this template, checking the field will prevent its children from being included in the sitemap.xml output

function renderSitemapPage(Page $page) {

	return 	"\n<url>" . 
		"\n\t<loc>" . $page->httpUrl . "</loc>" . 
		"\n\t<lastmod>" . date("Y-m-d", $page->modified) . "</lastmod>" . 

function renderSitemapChildren(Page $page) { 

	if($page->sitemap_exclude_children) return ""; /* Aded to exclude CHILDREN if field is checked */

	$out = '';
	$newParents = new PageArray(); 
	$children = $page->children; 
	foreach($children as $child) {
		$out .= renderSitemapPage($child);
		if($child->numChildren) $newParents->add($child); 
			else wire('pages')->uncache($child); 

	foreach($newParents as $newParent) {
		$out .= renderSitemapChildren($newParent); 

	return $out; 

function renderSitemapXML(array $paths = array()) {

	$out = 	'<?xml version="1.0" encoding="UTF-8"?>' . "\n" . 
		'<urlset xmlns="">';

	array_unshift($paths, '/'); // prepend homepage

	foreach($paths as $path) {
		$page = wire('pages')->get($path); 
		if(!$page->id) continue; 
		$out .= renderSitemapPage($page);
		if($page->numChildren) { $out .=  renderSitemapChildren($page); }

	$out .= "\n</urlset>";

	return $out; 

header("Content-Type: text/xml");
echo renderSitemapXML(); 
// Example: echo renderSitemapXML(array('/hidden/page/')); 


In conclusion, I have used a couple different processwire sitemap generating modules. But for my needs, the above script is fast and easy to setup/modify.

- Thanks


  • Like 5

Share this post

Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

  • Recently Browsing   0 members

    No registered users viewing this page.

  • Similar Content

    • By MoritzLost
      Hello there,
      I've started using ProcessWire at work a while ago and I have been really enjoying building modular, clean and fast sites based on the CMS (at work, I usually post as @schwarzdesign). While building my first couple of websites with ProcessWire, I have written some useful helper functions for repetitive tasks. In this post I want to showcase and explain a particular function that generates a responsive image tag based on an image field, in the hope that some of you will find it useful :)
      I'll give a short explanation of responsive images and then walk through the different steps involved in generating the necessary markup & image variations. I want to keep this beginner-friendly, so most of you can probably skip over some parts.
      What are responsive images
      I want to keep this part short, there's a really good in-depth article about responsive images on MDN if you are interested in the details. The short version is that a responsive image tag is simply an <img>-tag that includes a couple of alternative image sources with different resolutions for the browser to choose from. This way, smaller screens can download the small image variant and save data, whereas high-resolution retina displays can download the extra-large variants for a crisp display experience. This information is contained in two special attributes:
      srcset - This attribute contains a list of source URLs for this image. For each source, the width of the image in pixels is specified. sizes - This attribute tells the browser how wide a space is available for the image, based on media queries (usually the width of the viewport). This is what a complete responsive image tag may look like:
      <img srcset="/site/assets/files/1015/happy_sheep_07.300x0.jpg 300w, /site/assets/files/1015/happy_sheep_07.600x0.jpg 600w, /site/assets/files/1015/happy_sheep_07.900x0.jpg 900w, /site/assets/files/1015/happy_sheep_07.1200x0.jpg 1200w, /site/assets/files/1015/happy_sheep_07.1800x0.jpg 1800w, /site/assets/files/1015/happy_sheep_07.2400x0.jpg 2400w" sizes="(min-width: 1140px) 350px, (min-width: 992px) 480px, (min-width: 576px) 540px, 100vw" src="/site/assets/files/1015/happy_sheep_07.1200x0.jpg" alt="One sheep"> This tells the browser that there are six different sources for this image available, ranging from 300px to 2400px wide variants (those are all the same image, just in different resolutions). It also tells the browser how wide the space for the image will be:
      350px for viewports >= 1140px 480px for viewports >= 992px 540px for viewports >= 576px 100vw (full viewport width) for smaller viewports The sizes queries are checked in order of appearance and the browser uses the first one that matches. So now, the browser can calculate how large the image needs to be and then select the best fit from the srcset list to download. For browsers that don't support responsive images, a medium-sized variant is included as the normal src-Attribute.
      This is quite a lot of markup which I don't want to write by hand every time I want to place an image in a ProcessWire template. The helper function will need to generate both the markup and the variations of the original image.
      Building a reusable responsive image function
      Let's start with a function that takes two parameters: a Pageimage object and a standard width. Every time you access an image field through the API in a template (e.g. $page->my_image_field), you get a Pageimage object. Let's start with a skeleton for our function:
      function buildResponsiveImage( Pageimage $img, int $standard_width ): string { $default_img = $img->maxWidth($standard_width); return '<img src="' . $default_img->url() . '" alt="' . $img->description() . '">'; } // usage example echo buildResponsiveImage($page->my_image_field, 1200); This is already enough for a normal img tag (and it will serve as a fallback for older browsers). Now let's start adding to this, trying to keep the function as flexible and reusable as possible.
      Generating alternate resolutions
      We want to add a parameter that will allow the caller to specify in what sizes the alternatives should be generated. We could just accept an array parameter that contains the desired sizes as integers. But that is not very extendible, as we'll need to specify those sizes in each function call and change them all if the normal size of the image in the layout changes. Instead, we can use an array of factors; that will allow us to set a reasonable default, and still enable us to manually overwrite it. In the following, the function gets an optional parameter $variant_factor.
      // get the original image in full size $original_img = $img->getOriginal() ?? $img; // the default image for the src attribute, it wont be upscaled $default_image = $original_img->width($standard_width, ['upscaling' => false]); // the maximum size for our generated images $full_image_width = $original_img->width(); // fill the variant factors with defaults if not set if (empty($variant_factors)) { $variant_factors = [0.25, 0.5, 0.75, 1, 1.5, 2]; } // build the srcset attribute string, and generate the corresponding widths $srcset = []; foreach ($variant_factors as $factor) { // round up, srcset doesn't allow fractions $width = ceil($standard_width * $factor); // we won't upscale images if ($width <= $full_image_width) { $srcset[] = $original_img->width($width)->url() . " {$width}w"; } } $srcset = implode(', ', $srcset); // example usage echo buildResponsiveImage($page->my_image_field, 1200, [0.4, 0.5, 0.6, 0.8, 1, 1.25, 1.5, 2]); Note that for resizing purposes, we want to get the original image through the API first, as we will generate some larger alternatives of the images for retina displays. We also don't want to generate upscaled versions of the image if the original image isn't wide enough, so I added a constraint for that.
      The great thing about the foreach-loop is that it generates the markup and the images on the server at the same time. When we call $original_img->width($width), ProcessWire automatically generates a variant of the image in that size if it doesn't exist already. So we need to do little work in terms of image manipulation.
      Generating the sizes attribute markup
      For this, we could build elaborate abstractions of the normal media queries, but for now, I've kept it very simple. The sizes attribute is defined through another array parameter that contains the media queries as strings in order of appearance.
      $sizes_attribute = implode(', ', $sizes_queries); The media queries are always separated by commas followed by a space character, so that part can be handled by the function. We'll still need to manually write the media queries when calling the function though, so that is something that can be improved upon.
      Finetuning & improvements
      This is what the function looks like now:
      function buildResponsiveImage( Pageimage $img, int $standard_width, array $sizes_queries, ?array $variant_factors = [] ): string { // get the original image in full size $original_img = $img->getOriginal() ?? $img; // the default image for the src attribute, it wont be upscaled $default_image = $original_img->width($standard_width, ['upscaling' => false]); // the maximum size for our generated images $full_image_width = $original_img->width(); // fill the variant factors with defaults if not set if (empty($variant_factors)) { $variant_factors = [0.25, 0.5, 0.75, 1, 1.5, 2]; } // build the srcset attribute string, and generate the corresponding widths $srcset = []; foreach ($variant_factors as $factor) { // round up, srcset doesn't allow fractions $width = ceil($standard_width * $factor); // we won't upscale images if ($width <= $full_image_width) { $srcset[] = $original_img->width($width)->url() . " {$width}w"; } } $srcset = implode(', ', $srcset); return '<img src="' . $default_img->url() . '" alt="' . $img->description() . '" sizes="' . $sizes_attribute . '" srcset="' . $srcset . '">'; } It contains all the part we need, but there are some optimizations to make.
      First, we can make the $sizes_queries parameters optional. The sizes attribute default to 100vw (so the browser will always download an image large enough to fill the entire viewport width). This isn't optimal as it wastes bandwidth if the image doesn't fill the viewport, but it's good enough as a fallback.
      We can also make the width optional. When I have used this function in a project, the image I passed in was oftentimes already resized to the correct size. So we can make $standard_width an optional parameter that defaults to the width of the passed image.
      if (empty($standard_width)) { $standard_width = $img->width(); } Finally, we want to be able to pass in arbitrary attributes that will be added to the element. For now, we can just add a parameter $attributes that will be an associative array of attribute => value pairs. Then we need to collapse those into html markup.
      $attr_string = implode( ' ', array_map( function($attr, $value) { return $attr . '="' . $value . '"'; }, array_keys($attributes), $attributes ) ); This will also allow for some cleanup in the way the other attributes are generated, as we can simply add those to the $attributes array along the way.
      Here's the final version of this function with typehints and PHPDoc. Feel free to use this is your own projects.
      /** * Builds a responsive image element including different resolutions * of the passed image and optionally a sizes attribute build from * the passed queries. * * @param \Processwire\Pageimage $img The base image. * @param int|null $standard_width The standard width for this image. Use 0 or NULL to use the inherent size of the passed image. * @param array|null $attributes Optional array of html attributes. * @param array|null $sizes_queries The full queries and sizes for the sizes attribute. * @param array|null $variant_factors The multiplication factors for the alternate resolutions. * @return string */ function buildResponsiveImage( \Processwire\Pageimage $img, ?int $standard_width = 0, ?array $attributes = [], ?array $sizes_queries = [], ?array $variant_factors = [] ): string { // if $attributes is null, default to an empty array $attributes = $attributes ?? []; // if the standard width is empty, use the inherent width of the image if (empty($standard_width)) { $standard_width = $img->width(); } // get the original image in full size $original_img = $img->getOriginal() ?? $img; // the default image for the src attribute, it wont be // upscaled if the desired width is larger than the original $default_image = $original_img->width($standard_width, ['upscaling' => false]); // we won't create images larger than the original $full_image_width = $original_img->width(); // fill the variant factors with defaults if (empty($variant_factors)) { $variant_factors = [0.25, 0.5, 0.75, 1, 1.5, 2]; } // build the srcset attribute string, and generate the corresponding widths $srcset = []; foreach ($variant_factors as $factor) { // round up, srcset doesn't allow fractions $width = ceil($standard_width * $factor); // we won't upscale images if ($width <= $full_image_width) { $srcset[] = $original_img->width($width)->url() . " {$width}w"; } } $attributes['srcset'] = implode(', ', $srcset); // build the sizes attribute string if ($sizes_queries) { $attributes['sizes'] = implode(', ', $sizes_queries); } // add src fallback and alt attribute $attributes['src'] = $default_image->url(); if ($img->description()) { $attriutes['alt'] = $img->description(); } // implode the attributes array to html markup $attr_string = implode(' ', array_map(function($attr, $value) { return $attr . '="' . $value . '"'; }, array_keys($attributes), $attributes)); return "<img ${attr_string}>"; } Example usage with all arguments:
      echo buildResponsiveImage( $page->testimage, 1200, ['class' => 'img-fluid', 'id' => 'photo'], [ '(min-width: 1140px) 350px', '(min-width: 992px) 480px', '(min-width: 576px) 540px', '100vw' ], [0.4, 0.5, 0.6, 0.8, 1, 1.25, 1.5, 2] ); Result:
      <img class="img-fluid" id="photo" srcset="/site/assets/files/1/sean-pierce-1053024-unsplash.480x0.jpg 480w, /site/assets/files/1/sean-pierce-1053024-unsplash.600x0.jpg 600w, /site/assets/files/1/sean-pierce-1053024-unsplash.720x0.jpg 720w, /site/assets/files/1/sean-pierce-1053024-unsplash.960x0.jpg 960w, /site/assets/files/1/sean-pierce-1053024-unsplash.1200x0.jpg 1200w, /site/assets/files/1/sean-pierce-1053024-unsplash.1500x0.jpg 1500w, /site/assets/files/1/sean-pierce-1053024-unsplash.1800x0.jpg 1800w, /site/assets/files/1/sean-pierce-1053024-unsplash.2400x0.jpg 2400w" sizes="(min-width: 1140px) 350px, (min-width: 992px) 480px, (min-width: 576px) 540px, 100vw" src="/site/assets/files/1/sean-pierce-1053024-unsplash.1200x0.jpg" alt="by Sean Pierce"> Now this is actually too much functionality for one function; also, some of the code will be exactly the same for other, similar helper functions. If some of you are interested, I'll write a second part on how to split this into multiple smaller helper functions with some ideas on how to build upon it. But this has gotten long enough, so yeah, I hope this will be helpful or interesting to some of you :)
      Also, if you recognized any problems with this approach, or can point out some possible improvements, let me know. Thanks for reading!
    • By Marco Angeli
      Hi there,
      I added a ssl certificate to my site and I'd like to redirect every single http url to its new https version
      So I added this code in the .htacces file, after the RewriteEngine On :
      Redirect 301 /about
      Unfortunately this is now working: I get the "too many redirects" error.
      The following code works, but it's a bulk redirection to the home page, something I don't want for SEO reasons (😞
      RewriteCond %{HTTP_HOST} mysite\.it [NC]
      RewriteCond %{SERVER_PORT} 80
      RewriteRule ^(.*)$$1 [R,L]
      Any suggestions?
    • By Guy Verville
      First of all, I'm not an expert on PHP. I recently read about generators and I understand their usefulness in avoiding loading a set of objects into an array to the point of saturating the memory.
      The $pages->find() call is known to be greedy (and slow) when it comes to processing large amounts of pages, because it loads all objects into memory.
      Is there a way to use a generator to avoid this problem? Is there a workaround? I know that $pages->findMany() exists, but it is also called greedy.
      Translated with
    • By gebeer
      just wanted to share something I came across while working on an import module for XML data from a web service. The XML I got was not huge, but still, loading around 3.5 MB of XML with 250+ large child nodes into memory at once with simplexml_load_file() and then looping over it had significant impact on performance.
      I searched for a solution and found this great article about how to parse large XML files.
      It basically explains how to utilize the native XMLReader class together with SimpleXMLElement to handle such situations in a memory efficient way.
      After implementing it I got a significant improval on perceived performance. No comparison in numbers to share here as I'm a bit short on time.
    • By kaba86
      Hello PW Community, really glad that discovered this CMS recently, it is very strange it took so long That idea of no front design limitations is just awesome!
      Need to say that I have a bit of knowledge of html and css, but almost no php, so I need your help.
      What I want to do is an article posting  cms, with this structure:
      - Homepage - Projects - Articles -- Category 1 --- Articles of category 1 -- Category 2 --- Articles of category 2 - About - Contact Found this ProcessWire Profile
      It covers almost all my needs, except the menu. When I add a childpage for this page , World:World doesn't appear under Writings & Publications.
      I need a menu that works like a breadcrumb, that shows on the menu the category that you are viewing. So when I'm in articles page, on the menu it shows only articles and it's categories. When I get into a category, that category takes state active link but doesn't show on the menu links and titles for contained articles. How can I do that?
      Sorry for my long writing and English, it is not my native but I hope you understood what I need. Can you help me with that?
      Thank you