Jump to content
Pete

Module: XML Sitemap

Recommended Posts

I'm on the verge of converting your code back to a template, this is getting silly...

Just want to clarify that the problem is not with Pete's module. There was an obscure problem with the modules installer in the core which has been fixed. I'm not certain this is the problem in your case as you are still getting a different behavior than us, but I suggest grabbing the latest copy of ProcessWire before trying anything else.

Share this post


Link to post
Share on other sites

Just want to clarify that the problem is not with Pete's module.

My apologies for implying any such thing, just getting frustrated. When I saw this module it looked like a great timesaver, and that hasn't worked out.

I'm uploading a fresh copy of PW now, we'll see how it goes.

EDIT: Still no go!

Share this post


Link to post
Share on other sites

Can you confirm that you still get this same error message and that your ProcessWire reports it's version number as 2.2?

Exception: Unable to create path: /MarkupSitemapXML/ (in /home/theseeke/public_html/wire/core/CacheFile.php line 62) This error message was shown because you are logged in as a Superuser. Error has been logged.

thanks,

Ryan

Share this post


Link to post
Share on other sites

I just downloaded a fresh/blank copy of PW 2.2 and installed this module just to make sure it didn't have anything to do with the dev site I had tested on before. But it's working as expected, creating the dir in the right place, etc. So there must be something else that I'm missing. The behavior you are describing definitely indicates a core bug. I can't think of any other possibility. I'm going to do more testing here and hope to find and push a solution shortly.

Share this post


Link to post
Share on other sites

I'm pleased to say that this issue is resolved, and when it came right down to it, Problem Existed Between Keyboard and Chair.

Stupid me forgot to read the instructions, and installed the module to /wire/modules instead of /site/modules.

Now that it's in the right place it's working fine, cheers!

Share this post


Link to post
Share on other sites

Thanks for reporting back, glad that it's working. I was stuck trying to determine how it was doing that, so it's a relief to hear it's resolved. This has still been valuable though as we did solve a bug as a result (top of this page).

Share this post


Link to post
Share on other sites

Hi there,

I've been using this module on a couple of sites in place of the template I previously used and it certainly is a boon in keeping my template folder a lot tidier — great work!

I changed line 53 to check for access when iterating the children so that user-restricted pages do not show up in the sitemap. I think this is a saner default but could be a settting too.

foreach($page->children("check_access=1") as $child) $entry .= $this->sitemapListPage($child);

Thanks,

Stephen

Share this post


Link to post
Share on other sites

Stephen: check_access=1 is default, so you don't have to include that. If you don't want to check access, then you need to use check_access=0.

  • Like 1

Share this post


Link to post
Share on other sites

Very true. I obviously visited the sitemap straight after installing, while still logged in as admin. My mistake!

Thanks, apeisa

Share this post


Link to post
Share on other sites

Want to share a few things I've added for a magazine site I'm building:

First of all, I think "priority" and "changefreq" are fairly important if you're going to have a sitemap at all. This post has some info on Google's guidelines: http://www.eduki.com...-are-important/

What I decided to do was quickly add two global fields to PW so I can set these values manually in each page:

-sitemap_priority

-changefreq

And in the code:

public function sitemapListPage($page) {
$entry = "";
$default_priority = "0.5";
$default_changefreq = "monthly";

include $this->fuel('config')->paths->templates . "sitemap_module_defaults.inc";

if ($page->sitemap_ignore == 0 || $page->path == '/') { // $page->path part added so that it ignores hiding the homepage, else you wouldn't have ANY pages returned
$modified = date ('Y-m-d', $page->modified);
$entry = "\n <url>\n";
$entry .= " <loc>{$page->httpUrl}</loc>\n";
$entry .= " <lastmod>{$modified}</lastmod>\n";

if(!empty($page->sitemap_priority)) {
$entry .= " <priority>{$page->sitemap_priority}</priority>\n";
} else {
$entry .= " <priority>{$default_priority}</priority>\n";
}

if(!empty($page->changefreq)) {
$entry .= " <changefreq>{$page->changefreq}</changefreq>\n";
} else {
$entry .= " <changefreq>{$default_changefreq}</changefreq>\n";
}

 $entry .= " </url>";
if($page->numChildren) {
foreach($page->children as $child) $entry .= $this->sitemapListPage($child);
}
}
return $entry;
}

The sitemap_module_defaults.inc file in the templates dir is so I can set some values on the fly without doing it manually:

<?php
switch($page->template->name) {
case "blog_post":
case "blog_topic":
case "blog_topic_type":
$default_priority = "0.7";
$default_changefreq = "daily";
break;
}
?>

That's done and works fine for me.

Something I found frustrating with the sitemap module was that I couldn't add virtual pages I made. For example, if there's a page with urlSegments with some kind of page manipulation they won't show up on the sitemap, for obvious reasons. So what I did was add this method to the module:

public function sitemapListVirtualPage($httpUrl, $modified = NULL, $sitemap_priority = "0.5", $changefreq = "monthly") {
$entry = "";
$modified = date ('Y-m-d', $modified);
$entry = "\n <url>\n";
$entry .= " <loc>{$httpUrl}</loc>\n";

if($modified) {
$entry .= " <lastmod>{$modified}</lastmod>\n";
}

if($sitemap_priority) {
$entry .= " <priority>" . (float)$sitemap_priority . "</priority>\n";
}

if($changefreq) {
$entry .= " <changefreq>{$changefreq}</changefreq>\n";
}

 $entry .= " </url>";
return $entry;
}

And added this include to the init method before the output is saved to cache:


public function init() {
// Intercept a request for a root URL ending in sitemap.xml and output
if (strpos($_SERVER['REQUEST_URI'], wire('config')->urls->root . 'sitemap.xml') !== FALSE) {
// Check for the cached sitemap, else generate and cache a fresh sitemap
$cache = wire('modules')->get("MarkupCache");
if(!$output = $cache->get("MarkupSitemapXML", 3600)) {
$output = "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n";
$output .= '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">';
$output .= $this->sitemapListPage(wire('pages')->get("/"));

include $this->fuel('config')->paths->templates . "sitemap_module_virtual.inc";

$output .= "\n</urlset>";
$cache->save($output);
}

header("Content-Type: text/xml");
echo $output;
exit;
}
}

And in my sitemap_module_virtual.inc file I did this:

<?php
foreach(wire('pages')->find("template=blog_topic|blog_topic_type") as $real_page) {
$output .= $this->sitemapListVirtualPage("http://" . $this->fuel('config')->httpHost . $real_page->url . "archives/", wire('pages')->get("template=blog_post,sort=-created")->modified, "0.4", "daily");
}
foreach(wire('pages')->find("template=blog_topic|blog_topic_type") as $real_page) {
$output .= $this->sitemapListVirtualPage("http://" . $this->fuel('config')->httpHost . $real_page->url . "rss/", wire('pages')->get("template=blog_post,sort=-created")->modified, "0.4", "daily");
}
?>

In this part, I have manipulated /archive/ and /rss/ sub-pages for each blog_topic and sub-topic(or blog_topic_type) using urlSegments. I wanted these to show up on the sitemap, even though they're probably not super important.

Usually I wouldn't bother doing this if it was just these kinds of pages. But what if you have countless articles with manipulated urls that don't show in the sitemap? This is perfect example of why I did this for the future as well.

I've been trying to understand how to do these things for a while, so I hope it helps someone out.

EDIT: I have a download with all of my changes if anyone's interested in taking a look: http://clintonskakun.com/processwire-docs/posts/the-xml-sitemap-module-with-priority-and-changefreq-and-more/

Edited by ClintonSkakun
  • Like 1

Share this post


Link to post
Share on other sites

Thanks for posting this, these seem like some useful additions and some good insights on sitemap.xml too.

Share this post


Link to post
Share on other sites

jukooz asked me a while back how to stick the sitemaps into a template as he's using the multisite module and currently I guess that this module will pull the sitemap for EVERY site on an install...?

This should work on a per-site basis having it in a separate template file per site as a workaround for now, but pay attention to the comments please as there are things to change per-site - please also note that this is un-tested and largely just pulled from the module and tweaked for pasting into a template for use instead of the module:

EDIT: See attachment as the forum software tries to parse a URL in the code : sitemap.txt

I've also updated the module (see first post) to v1.0.3 to check if the page is viewable before including it in the sitemap - I noticed that it was incorrectly listing pages that had no template file... oops!

  • Like 1

Share this post


Link to post
Share on other sites

Hey,

is it possible to use this module with the LanguageLocalizedURL ?

I wanne make the site multilanguage like this:

www.url.de/de/testindeutsch/

www.url.de/en/testinenglish/

Would like to hear from you... Greets Jens alias DV-JF

Share this post


Link to post
Share on other sites

While I've not tried it, I would guess that this module does not collaborate with or accommodate the LanguageLocalizedURL module in any special way. Though Pete could say for sure. 

Share this post


Link to post
Share on other sites

It doesn't accommodate it at the moment, no, but I may need this myself soon so if you can wait a week or two I may have an update (it's not at the top of my list at the moment but I might surprise myself and do it sooner ;)).

Thinking out loud, it needs:

  • Check if LanguageLocalizedURL is installed
  • Find the root page for each language
  • Generate a sitemap.xml page under each root page (it doesn't really generate a page in the database, it just outputs the sitemap if you request that URL)

I think that's it, but I need to get to grips with the LanguageLocalizedURL first.

Share this post


Link to post
Share on other sites

Pete the LLU module doesn't have separate trees, it uses a gateways page for each language in the root like "/de/" "/en/" with url segments to then get the page and switch user language.  The site structure is still all the same as without, you just use text language fields.

The parsing of the url happend automaticly through these gateway pages and it hooks into Page::path to change the url of the pages system wide. So if you do a echo $page->url you'll get the language url in the language you're currently in. Like /en/about-us/, /de/ueber-uns/. 

Have anyone tried yet if it doesn't already work?

Share this post


Link to post
Share on other sites

Soma just submitted a pull request and I merged it.

The 1.0.4 version should allow you to use this with the LanguageLocalizedURL module - let us know how you get on and thanks to Soma ;)

Share this post


Link to post
Share on other sites

Pete, thanks, but I already pulled another request, forgot to up the cache time.

Also to fully work it would also have to add the "language_published" check to see if language version of the page is really published. Will try to add that later.

Also wanted to add that I never use sitemaps for google and never will again (used to try long time ago, but it doesn't really help at all if you build the site carefully. It just eating time doing it and making sure everthing works still). It's not as easy as it first seems and can even be contra productive if not done carefully. Problem with this module as it is now, it will not find and list pages that may are added through urlSegments and I don't see a way to do it easy. Also it doesn't have weighting etc.

  • Like 1

Share this post


Link to post
Share on other sites

Thanks - merged it :)

The main reason I created it was because it gets a new site listed faster - plain and simple. That was my experience from sitemaps a few years ago at least so I hope it still happens, but it does seem to work from my experience.

I've never bothered with weighting or anything like that. Not bothered about URL segments either as for a start the crawler will find a link to that from a normal page on the site presumably, but my main goal was to get Google to crawl the page sooner than it normally would - and I think it will do this every time it looks at the sitemap and sees a new page (assuming it doesn't find it in the crawl already on its own by following links in the content).

Think of it this way - if you release a brand new site without telling Google, it will take until someone links to your site before Google is aware of its existence. On a small, personal site with no blog or comments that could be a loooong time, but the first thing I usually do is launch a site and submit it so I know it's done.

The argument against this is there are other ways of doing that as search engine companies often have pages where you can just type your domain in and they will search it sometime later, but I like to keep Google informed when I add new pages etc just in case it has a lapse in concentration and misses something :)

Share this post


Link to post
Share on other sites

just uped another pull request... :)

I think google doesn't parse the website if it has a sitemap.xml. Or has this changed? I found myself fixing others website by actually removing the sitemap form google.

Google is so fast nowadays it doesn't matter that much, as it won't be in index before index update. I could be wrong as things change all the time, but thats my personal experience and from readup. Also if you add a sitemap page to your site will help if you really worry about it. As with most seo things you have to take everything with a grain of salt at the end.

It's maybe ok if google doesn't come to your new site if it's new, but tests have shown even only 1 link to your site and google will parse it in 1 day if you have a nice structure.

Share this post


Link to post
Share on other sites

I used to post a link on my website to give my new client sites a poke. Work well. Or even in this forum.

  • Like 1

Share this post


Link to post
Share on other sites

Forgot to mention, I've pulled in the last of Soma's pull requests so this is now v1.0.5 to implement the added functionality he wrote as well as the fixes for the bugs he created :P;)

My fault for not testing :D

  • Like 1

Share this post


Link to post
Share on other sites

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.


  • Recently Browsing   0 members

    No registered users viewing this page.

  • Similar Content

    • By Gadgetto
      SnipWire - Snipcart integration for ProcessWire
      Snipcart is a powerful 3rd party, developer-first HTML/JavaScript shopping cart platform. SnipWire is the missing link between Snipcart and the content management framework ProcessWire.
      With SnipWire, you can quickly turn any ProcessWire site into a Snipcart online shop. The SnipWire plugin helps you to get your store up and running in no time. Detailed knowledge of the Snipcart system is not required.
      SnipWire is free and open source licensed under Mozilla Public License 2.0! A lot of work and effort has gone into development. It would be nice if you could donate an amount to support further development:

      Status update links (inside this thread) for SnipWire development
      2020-01-21 -- Snipcart v3 - when will the new cart system be implemented? 2020-01-19 -- integrated taxes provider finished (+ very flexible shipping taxes handling) 2020-01-14 -- new date range picker, discount editor, order notifiactions, order statuses, and more ... 2019-11-15 -- orders filter, order details, download + resend invoices, refunds 2019-10-18 -- list filters, REST API improvements, new docs platform, and more ... 2019-08-08 -- dashboard interface, currency selector, managing Orders, Customers and Products, Added a WireTabs, refinded caching behavior 2019-06-15 -- taxes provider, shop templates update, multiCURL implementation, and more ... 2019-06-02 -- FieldtypeSnipWireTaxSelector 2019-05-25 -- SnipWire will be free and open source Plugin Key Features
      Fast and simple store setup Full integration of the Snipcart dashboard into the ProcessWire backend (no need to leave the ProcessWire admin area) Browse and manage orders, customers, discounts, abandoned carts, and more Process refunds and send customer notifications from within the ProcessWire backend Process Abandoned Carts + sending messages to customers from within the ProcessWire backend Complete Snipcart webhooks integration (all events are hookable via ProcessWire hooks) Integrated taxes provider (which is more flexible then Snipcart own provider) Useful Links
      SnipWire in PW modules directory (alpha version only available via GitHub) SnipWire Docs (please note that the documentation is a work in progress) SnipWire @GitHub (feature requests and suggestions for improvement are welcome - I also accept pull requests) Snipcart Website  
      ---- INITIAL POST FROM 2019-05-25 ----
       
    • By d'Hinnisdaël
      Happy new year, everybody 🥬
      I've been sitting on this Dashboard module I made for a client and finally came around to cleaning it up and releasing it to the wider public. This is how it looks.
      ProcessWire Dashboard

      If anyone is interested in trying this out, please go ahead! I'd love to get some feedback on it. If this proves useful and survives some real-world testing, I'll add this to the module directory.
      Download
      You can find the latest release on Github.
      Documentation
      Check out the documentation to get started. This is where you'll find information about included panel types and configuration options.
      Custom Panels
      My goal was to make it really simple to create custom panels. The easiest way to do that is to use the panel type template and have it render a file in your templates folder. This might be enough for 80% of all use cases. For anything more complex (FormBuilder submissions? Comments? Live chat?), you can add new panel types by creating modules that extend the DashboardPanel base class. Check out the documentation on custom panels or take a look at the HelloWorld panel to get started. I'm happy to merge any user-created modules into the main repo if they might be useful to more than a few people.
       Disclaimer
      This is a pre-release version. Please treat it as such — don't install it on production sites. Just making sure 🍇
      Roadmap
      These are the things I'm looking to implement myself at some point. The wishlist is a lot longer, but those are the 80/20 items that I probably won't regret spending time on.
      Improve documentation & add examples ⚙️ Panel types Google Analytics ⚙️ Add new page  🔥 Drafts 🔥 At a glance / Page counter 404s  Layout options Render multiple tabs per panel panel groups with heading and spacing between ✅ panel wrappers as grid item (e.g. stacked notices) ✅ Admin themes support AdminThemeReno and AdminThemeDefault ✅ Shortcuts panel add a table layout with icon, title & summary ✅ Chart panel add default styles for common chart types ✅ load chart data from JS file (currently passed as PHP array) Collection panel support image columns ✅ add buttons: view all & add new ✅
    • By Robin S
      This module is inspired by and similar to the Template Stubs module. The author of that module has not been active in the PW community for several years now and parts of the code for that module didn't make sense to me, so I decided to create my own module. Auto Template Stubs has only been tested with PhpStorm because that is the IDE that I use.
      Auto Template Stubs
      Automatically creates stub files for templates when fields or fieldgroups are saved.
      Stub files are useful if you are using an IDE (e.g. PhpStorm) that provides code assistance - the stub files let the IDE know what fields exist in each template and what data type each field returns. Depending on your IDE's features you get benefits such as code completion for field names as you type, type inference, inspection, documentation, etc.
      Installation
      Install the Auto Template Stubs module.
      Configuration
      You can change the class name prefix setting in the module config if you like. It's good to use a class name prefix because it reduces the chance that the class name will clash with an existing class name.
      The directory path used to store the stub files is configurable.
      There is a checkbox to manually trigger the regeneration of all stub files if needed.
      Usage
      Add a line near the top of each of your template files to tell your IDE what stub class name to associate with the $page variable within the template file. For example, with the default class name prefix you would add the following line at the top of the home.php template file:
      /** @var tpl_home $page */ Now enjoy code completion, etc, in your IDE.

      Adding data types for non-core Fieldtype modules
      The module includes the data types returned by all the core Fieldtype modules. If you want to add data types returned by one or more non-core Fieldtype modules then you can hook the AutoTemplateStubs::getReturnTypes() method. For example, in /site/ready.php:
      // Add data types for some non-core Fieldtype modules $wire->addHookAfter('AutoTemplateStubs::getReturnTypes', function(HookEvent $event) { $extra_types = [ 'FieldtypeDecimal' => 'string', 'FieldtypeLeafletMapMarker' => 'LeafletMapMarker', 'FieldtypeRepeaterMatrix' => 'RepeaterMatrixPageArray', 'FieldtypeTable' => 'TableRows', ]; $event->return = $event->return + $extra_types; }); Credits
      Inspired by and much credit to the Template Stubs module by mindplay.dk.
       
      https://github.com/Toutouwai/AutoTemplateStubs
      https://modules.processwire.com/modules/auto-template-stubs/
    • By Mike Rockett
      Jumplinks for ProcessWire
      Release: 1.5.60
      Composer: rockett/jumplinks
      ⚠️ NOTICE: 1.5.60 is an important security patch-release for an XSS vulnerability discovered by @phlp. It's HIGHLY RECOMMENDED that all Jumplinks users update to the latest version as soon as possible.
      Jumplinks is an enhanced version of the original ProcessRedirects by Antti Peisa.
      The Process module manages your permanent and temporary redirects (we'll call these "jumplinks" from now on, unless in reference to redirects from another module), useful for when you're migrating over to ProcessWire from another system/platform. Each jumplink supports wildcards, shortening the time needed to create them.
      Unlike similar modules for other platforms, wildcards in Jumplinks are much easier to work with, as Regular Expressions are not fully exposed. Instead, parameters wrapped in curly braces are used - these are described in the documentation.
      Under Development: 2.0, to be powered by FastRoute
      As of version 1.5.0, Jumplinks requires at least ProcessWire 2.6.1 to run.
      View on GitLab
      Download via the Modules Directory
      Read the docs
      Features
      The most prominent features include:
      Basic jumplinks (from one fixed route to another) Parameter-based wildcards with "Smart" equivalents Mapping Collections (for converting ID-based routes to their named-equivalents without the need to create multiple jumplinks) Destination Selectors (for finding and redirecting to pages containing legacy location information) Timed Activation (activate and/or deactivate jumplinks at specific times) 404-Monitor (for creating jumplinks based on 404 hits) Additionally, the following features may come in handy:
      Stale jumplink management Legacy domain support for slow migrations An importer (from CSV or ProcessRedirects) Feedback & Feature Requests
      I’d love to know what you think of this module. Please provide some feedback on the module as a whole, or even regarding smaller things that make it whole. Also, please feel free to submit feature requests and their use-cases.
      Note: Features requested so far have been added to the to-do list, and will be added to 2.0, and not the current dev/master branches.
      Open Source

      Jumplinks is an open-source project, and is free to use. In fact, Jumplinks will always be open-source, and will always remain free to use. Forever. If you would like to support the development of Jumplinks, please consider making a small donation via PayPal.
      Enjoy! 🙂
    • By Robin S
      Add Image URLs
      Allows images/files to be added to Image/File fields by pasting URLs.

      Usage
      Install the Add Image URLs module.
      A "Paste URLs" button will be added to all image and file fields. Use the button to show a textarea where URLs may be pasted, one per line. Images/files are added when the page is saved.
       
      https://github.com/Toutouwai/AddImageUrls
      https://modules.processwire.com/modules/add-image-urls/
×
×
  • Create New...