Pete

Module: XML Sitemap


I'm on the verge of converting your code back to a template, this is getting silly...

Just want to clarify that the problem is not with Pete's module. There was an obscure problem with the modules installer in the core which has been fixed. I'm not certain this is the problem in your case as you are still getting a different behavior than us, but I suggest grabbing the latest copy of ProcessWire before trying anything else.


Just want to clarify that the problem is not with Pete's module.

My apologies for implying any such thing, just getting frustrated. When I saw this module it looked like a great timesaver, and that hasn't worked out.

I'm uploading a fresh copy of PW now, we'll see how it goes.

EDIT: Still no go!


Can you confirm that you still get this same error message and that your ProcessWire reports its version number as 2.2?

Exception: Unable to create path: /MarkupSitemapXML/ (in /home/theseeke/public_html/wire/core/CacheFile.php line 62) This error message was shown because you are logged in as a Superuser. Error has been logged.

thanks,

Ryan


I just downloaded a fresh/blank copy of PW 2.2 and installed this module just to make sure it didn't have anything to do with the dev site I had tested on before. But it's working as expected, creating the dir in the right place, etc. So there must be something else that I'm missing. The behavior you are describing definitely indicates a core bug. I can't think of any other possibility. I'm going to do more testing here and hope to find and push a solution shortly.


I'm pleased to say that this issue is resolved, and when it came right down to it, Problem Existed Between Keyboard and Chair.

Stupid me forgot to read the instructions, and installed the module to /wire/modules instead of /site/modules.

Now that it's in the right place it's working fine, cheers!


Thanks for reporting back, glad that it's working. I was stuck trying to determine how it was doing that, so it's a relief to hear it's resolved. This has still been valuable though as we did solve a bug as a result (top of this page).


Hi there,

I've been using this module on a couple of sites in place of the template I previously used and it certainly is a boon in keeping my template folder a lot tidier — great work!

I changed line 53 to check for access when iterating the children so that user-restricted pages do not show up in the sitemap. I think this is a saner default but could be a setting too.

foreach($page->children("check_access=1") as $child) $entry .= $this->sitemapListPage($child);

Thanks,

Stephen


Stephen: check_access=1 is default, so you don't have to include that. If you don't want to check access, then you need to use check_access=0.
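To make the default explicit, here is a minimal sketch of a helper that only switches access checking off when asked. The name `sitemapChildren` and its `$includeRestricted` flag are made up for illustration and are not part of the module; `$page` is assumed to be a ProcessWire page whose `children()` method takes a selector string, as in the line quoted above.

```php
<?php
// Hypothetical helper, not part of the module.
// $page is assumed to behave like a ProcessWire Page: children() takes a selector string.
function sitemapChildren($page, $includeRestricted = false) {
    // check_access=1 is the default, so it only needs stating when turning it off
    $selector = $includeRestricted ? "check_access=0" : "";
    return $page->children($selector);
}
```

Called as `sitemapChildren($page)` it behaves like a plain `$page->children()`; passing `true` as the second argument would also return pages the current user cannot access.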


Very true. I obviously visited the sitemap straight after installing, while still logged in as admin. My mistake!

Thanks, apeisa


Want to share a few things I've added for a magazine site I'm building:

First of all, I think "priority" and "changefreq" are fairly important if you're going to have a sitemap at all. This post has some info on Google's guidelines: http://www.eduki.com...-are-important/

What I decided to do was quickly add two global fields to PW so I can set these values manually in each page:

-sitemap_priority

-changefreq

And in the code:

public function sitemapListPage($page) {
    $entry = "";
    $default_priority = "0.5";
    $default_changefreq = "monthly";

    include $this->fuel('config')->paths->templates . "sitemap_module_defaults.inc";

    // $page->path part added so that it ignores hiding the homepage, else you wouldn't have ANY pages returned
    if ($page->sitemap_ignore == 0 || $page->path == '/') {
        $modified = date('Y-m-d', $page->modified);
        $entry = "\n <url>\n";
        $entry .= " <loc>{$page->httpUrl}</loc>\n";
        $entry .= " <lastmod>{$modified}</lastmod>\n";

        if(!empty($page->sitemap_priority)) {
            $entry .= " <priority>{$page->sitemap_priority}</priority>\n";
        } else {
            $entry .= " <priority>{$default_priority}</priority>\n";
        }

        if(!empty($page->changefreq)) {
            $entry .= " <changefreq>{$page->changefreq}</changefreq>\n";
        } else {
            $entry .= " <changefreq>{$default_changefreq}</changefreq>\n";
        }

        $entry .= " </url>";
        if($page->numChildren) {
            foreach($page->children as $child) $entry .= $this->sitemapListPage($child);
        }
    }
    return $entry;
}

The sitemap_module_defaults.inc file in the templates dir is so I can set some values on the fly without doing it manually:

<?php
switch($page->template->name) {
    case "blog_post":
    case "blog_topic":
    case "blog_topic_type":
        $default_priority = "0.7";
        $default_changefreq = "daily";
        break;
}
?>

That's done and works fine for me.
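Since these values are typed in by hand on each page, it may be worth validating them before they reach the XML. Below is a minimal sketch, assuming the limits from the sitemaps.org protocol (priority between 0.0 and 1.0, changefreq from a fixed list of words). The helper names `sitemapPriority` and `sitemapChangefreq` are hypothetical and not part of the module. Note that, unlike the `!empty()` check above, this sketch treats "0" as a valid priority.

```php
<?php
// Values allowed for <changefreq> by the sitemaps.org protocol
const SITEMAP_CHANGEFREQS = ['always', 'hourly', 'daily', 'weekly', 'monthly', 'yearly', 'never'];

// Hypothetical helper: returns the priority as a string with one decimal place,
// falling back to $default when the value is empty, non-numeric or outside 0.0-1.0.
function sitemapPriority($value, $default = "0.5") {
    if ($value === null || $value === '' || !is_numeric($value)) return $default;
    $p = (float) $value;
    if ($p < 0.0 || $p > 1.0) return $default;
    return number_format($p, 1, '.', '');
}

// Hypothetical helper: returns a lower-cased changefreq word,
// falling back to $default when the value is not one of the allowed words.
function sitemapChangefreq($value, $default = "monthly") {
    $value = strtolower(trim((string) $value));
    return in_array($value, SITEMAP_CHANGEFREQS, true) ? $value : $default;
}
```

In sitemapListPage() the two if/else blocks could then shrink to something like `sitemapPriority($page->sitemap_priority, $default_priority)`, though that is only one way to wire it in.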

Something I found frustrating with the sitemap module was that I couldn't add virtual pages I made. For example, if there's a page with urlSegments with some kind of page manipulation they won't show up on the sitemap, for obvious reasons. So what I did was add this method to the module:

public function sitemapListVirtualPage($httpUrl, $modified = NULL, $sitemap_priority = "0.5", $changefreq = "monthly") {
    $entry = "\n <url>\n";
    $entry .= " <loc>{$httpUrl}</loc>\n";

    // Only format and output <lastmod> when a timestamp was actually passed in
    if($modified) {
        $entry .= " <lastmod>" . date('Y-m-d', $modified) . "</lastmod>\n";
    }

    if($sitemap_priority) {
        $entry .= " <priority>" . (float)$sitemap_priority . "</priority>\n";
    }

    if($changefreq) {
        $entry .= " <changefreq>{$changefreq}</changefreq>\n";
    }

    $entry .= " </url>";
    return $entry;
}

And added this include to the init method before the output is saved to cache:


public function init() {
    // Intercept a request for a root URL ending in sitemap.xml and output
    if (strpos($_SERVER['REQUEST_URI'], wire('config')->urls->root . 'sitemap.xml') !== FALSE) {
        // Check for the cached sitemap, else generate and cache a fresh sitemap
        $cache = wire('modules')->get("MarkupCache");
        if(!$output = $cache->get("MarkupSitemapXML", 3600)) {
            $output = "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n";
            $output .= '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">';
            $output .= $this->sitemapListPage(wire('pages')->get("/"));

            include $this->fuel('config')->paths->templates . "sitemap_module_virtual.inc";

            $output .= "\n</urlset>";
            $cache->save($output);
        }

        header("Content-Type: text/xml");
        echo $output;
        exit;
    }
}
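For anyone unsure what the get/save pair is doing there, the pattern is "return the cached copy if it is fresh enough, otherwise regenerate and store it". The sketch below shows the same idea with a plain file. It is a hypothetical stand-in for illustration only, not how MarkupCache is implemented, and the name `cachedOutput` is made up.

```php
<?php
// Hypothetical stand-in for a time-based cache; this is not MarkupCache's implementation.
// Returns the cached string if $file is younger than $ttl seconds, else regenerates it.
function cachedOutput($file, $ttl, callable $generate) {
    if (is_file($file) && (time() - filemtime($file)) < $ttl) {
        return file_get_contents($file);
    }
    $output = $generate();
    // Store the fresh output so the next request within $ttl seconds can reuse it
    file_put_contents($file, $output, LOCK_EX);
    return $output;
}
```

With a `$ttl` of 3600 this matches the one-hour lifetime passed to `$cache->get()` above, which is also why changes to pages can take up to an hour to show up in the sitemap.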

And in my sitemap_module_virtual.inc file I did this:

<?php
// Look these up once instead of on every iteration
$topics = wire('pages')->find("template=blog_topic|blog_topic_type");
$latest = wire('pages')->get("template=blog_post,sort=-created")->modified;
$host = "http://" . $this->fuel('config')->httpHost;

foreach($topics as $real_page) {
    $output .= $this->sitemapListVirtualPage($host . $real_page->url . "archives/", $latest, "0.4", "daily");
}
foreach($topics as $real_page) {
    $output .= $this->sitemapListVirtualPage($host . $real_page->url . "rss/", $latest, "0.4", "daily");
}
?>

In this part, I have manipulated /archives/ and /rss/ sub-pages for each blog_topic and sub-topic (or blog_topic_type) using urlSegments. I wanted these to show up on the sitemap, even though they're probably not super important.

Usually I wouldn't bother doing this if it was just these kinds of pages. But what if you have countless articles with manipulated URLs that don't show in the sitemap? This is a perfect example of why I did this, for the future as well.
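For the "countless URLs" case, a loop over URL segments could be wrapped up as below. This is only a sketch: the function name `sitemapVirtualEntries` is hypothetical, and the escaping step is an addition based on the sitemaps.org protocol, which requires URLs in a sitemap to be entity-escaped (for example `&` as `&amp;`).

```php
<?php
// Hypothetical helper: builds one <url> entry per URL segment under $baseUrl.
// The sitemap protocol requires URLs to be entity-escaped, hence htmlspecialchars().
function sitemapVirtualEntries($baseUrl, array $segments, $priority = "0.4", $changefreq = "daily") {
    $out = "";
    foreach ($segments as $segment) {
        // Normalise slashes so the result is always base/segment/
        $loc = rtrim($baseUrl, "/") . "/" . trim($segment, "/") . "/";
        $loc = htmlspecialchars($loc, ENT_QUOTES | ENT_XML1, 'UTF-8');
        $out .= "\n <url>\n";
        $out .= " <loc>{$loc}</loc>\n";
        $out .= " <priority>{$priority}</priority>\n";
        $out .= " <changefreq>{$changefreq}</changefreq>\n";
        $out .= " </url>";
    }
    return $out;
}
```

Something like `sitemapVirtualEntries("http://example.com/blog/", array("archives", "rss"))` would then return two entries, one for each segment.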

I've been trying to understand how to do these things for a while, so I hope it helps someone out.

EDIT: I have a download with all of my changes if anyone's interested in taking a look: http://clintonskakun.com/processwire-docs/posts/the-xml-sitemap-module-with-priority-and-changefreq-and-more/

Edited by ClintonSkakun

Thanks for posting this, these seem like some useful additions and some good insights on sitemap.xml too.


jukooz asked me a while back how to stick the sitemaps into a template as he's using the multisite module and currently I guess that this module will pull the sitemap for EVERY site on an install...?

This should work on a per-site basis having it in a separate template file per site as a workaround for now, but pay attention to the comments please as there are things to change per-site - please also note that this is un-tested and largely just pulled from the module and tweaked for pasting into a template for use instead of the module:

EDIT: See attachment as the forum software tries to parse a URL in the code : sitemap.txt

I've also updated the module (see first post) to v1.0.3 to check if the page is viewable before including it in the sitemap - I noticed that it was incorrectly listing pages that had no template file... oops!
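A minimal sketch of what such a check might look like, combined with the sitemap_ignore test quoted earlier in the thread. This is not the actual 1.0.3 code: the function name `sitemapShouldList` is hypothetical, and `$page` is assumed to behave like a ProcessWire page with a `viewable()` method.

```php
<?php
// Hypothetical filter, not the module's actual code.
// $page is assumed to behave like a ProcessWire Page (viewable(), path, sitemap_ignore).
function sitemapShouldList($page) {
    // A page must be viewable; beyond that it is listed unless it is flagged
    // with sitemap_ignore, with the homepage exempt from that flag.
    return $page->viewable() && (!$page->sitemap_ignore || $page->path == '/');
}
```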


Hey,

Is it possible to use this module with the LanguageLocalizedURL module?

I want to make the site multilingual like this:

www.url.de/de/testindeutsch/

www.url.de/en/testinenglish/

Would like to hear from you... Greets Jens alias DV-JF


While I've not tried it, I would guess that this module does not collaborate with or accommodate the LanguageLocalizedURL module in any special way. Though Pete could say for sure. 


Hey Ryan,

thx for answering...

Though Pete could say for sure. 

Hope so :)

Greets


It doesn't accommodate it at the moment, no, but I may need this myself soon so if you can wait a week or two I may have an update (it's not at the top of my list at the moment but I might surprise myself and do it sooner ;)).

Thinking out loud, it needs:

  • Check if LanguageLocalizedURL is installed
  • Find the root page for each language
  • Generate a sitemap.xml page under each root page (it doesn't really generate a page in the database, it just outputs the sitemap if you request that URL)

I think that's it, but I need to get to grips with the LanguageLocalizedURL first.
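The "outputs the sitemap if you request that URL" part could be sketched as a small matcher on the request path. Everything here is an assumption for illustration: the function name `sitemapLanguageFromPath` is hypothetical, language codes are assumed to be two lowercase letters, the site is assumed to live at the web root, and it says nothing about how LanguageLocalizedURL works internally.

```php
<?php
// Hypothetical helper: given a request URI and a list of language codes, returns
// the code if the path is "/<code>/sitemap.xml", "" for the root "/sitemap.xml",
// or null if the request is not for a sitemap at all.
function sitemapLanguageFromPath($uri, array $languages) {
    // Strip any query string so only the path is compared
    $path = parse_url($uri, PHP_URL_PATH);
    if ($path === '/sitemap.xml') return '';
    if (is_string($path)
        && preg_match('#^/([a-z]{2})/sitemap\.xml$#', $path, $m)
        && in_array($m[1], $languages, true)) {
        return $m[1];
    }
    return null;
}
```

A call such as `sitemapLanguageFromPath($_SERVER['REQUEST_URI'], array('de', 'en'))` would then tell the module which language's sitemap, if any, was requested.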


Pete, the LLU module doesn't have separate trees; it uses a gateway page for each language in the root, like "/de/" and "/en/", with URL segments to then get the page and switch the user language. The site structure is still the same as without it, you just use text language fields.

The parsing of the URL happens automatically through these gateway pages, and it hooks into Page::path to change the URL of the pages system-wide. So if you do an echo $page->url you'll get the language URL in the language you're currently in, like /en/about-us/ or /de/ueber-uns/.

Has anyone tried yet whether it already works?


Soma just submitted a pull request and I merged it.

The 1.0.4 version should allow you to use this with the LanguageLocalizedURL module - let us know how you get on and thanks to Soma ;)


Pete, thanks, but I've already submitted another pull request, as I forgot to up the cache time.

Also, to work fully it would have to add the "language_published" check to see if the language version of the page is really published. Will try to add that later.

I also wanted to add that I never use sitemaps for Google and never will again. I used to try a long time ago, but it doesn't really help at all if you build the site carefully. It just eats time doing it and making sure everything still works. It's not as easy as it first seems and can even be counterproductive if not done carefully. The problem with this module as it is now is that it will not find and list pages that are added through urlSegments, and I don't see an easy way to do it. Also, it doesn't have weighting etc.


Thanks - merged it :)

The main reason I created it was because it gets a new site listed faster - plain and simple. That was my experience from sitemaps a few years ago at least so I hope it still happens, but it does seem to work from my experience.

I've never bothered with weighting or anything like that. Not bothered about URL segments either as for a start the crawler will find a link to that from a normal page on the site presumably, but my main goal was to get Google to crawl the page sooner than it normally would - and I think it will do this every time it looks at the sitemap and sees a new page (assuming it doesn't find it in the crawl already on its own by following links in the content).

Think of it this way - if you release a brand new site without telling Google, it will take until someone links to your site before Google is aware of its existence. On a small, personal site with no blog or comments that could be a loooong time, but the first thing I usually do is launch a site and submit it so I know it's done.

The argument against this is there are other ways of doing that as search engine companies often have pages where you can just type your domain in and they will search it sometime later, but I like to keep Google informed when I add new pages etc just in case it has a lapse in concentration and misses something :)


Just upped another pull request... :)

I think Google doesn't parse the website if it has a sitemap.xml. Or has this changed? I found myself fixing others' websites by actually removing the sitemap from Google.

Google is so fast nowadays that it doesn't matter that much, as a page won't be in the index before the next index update anyway. I could be wrong as things change all the time, but that's my personal experience and what I've read up on. Also, adding a sitemap page to your site will help if you really worry about it. As with most SEO things, you have to take everything with a grain of salt in the end.

It may be OK if Google doesn't come to your site while it's new, but tests have shown that even with only one link to your site, Google will parse it within a day if you have a nice structure.


I used to post a link on my website to give my new client sites a poke. Works well. Or even in this forum.


Forgot to mention, I've pulled in the last of Soma's pull requests so this is now v1.0.5 to implement the added functionality he wrote as well as the fixes for the bugs he created :P;)

My fault for not testing :D

