Jump to content

MarkupSitemap


Mike Rockett

Recommended Posts

Thanks so much @d'Hinnisdaël, really appreciate the effort here ? Will review, test and merge this morning. Regarding videos, you might actually want to hold off on that for a bit, as I'd first like to add the additional sub-element class for it, and perhaps a few others as well. The Sitemap class itself comes from ThePixelDeveloper over at GitHub, and so I think I might want to upgrade that instead, to utilise the newer features / code-base.

Link to comment
Share on other sites

So I've been a bit busy this morning refactoring and brooming. A man is busy of late, but hey, gotta make some time for these things eventually.

Anyways, couple of things I've done so far:

  1. Updated the underlying sitemap package.
  2. Refactor: split some code out into concerns, added support helpers, added return types and fixed up doc blocks.
  3. Dropped support for PW 2.8 and PHP < 7.1 (should be 7.2, but fine for now)
  4. Manually pulled in the merge request with an additional static method to add the language support hooks. I don't really have a good test case to work with, but this is the idea:
use MarkupSitemap;

wire()->addHookAfter(MarkupSitemap::getAdditionalPages, function ($event) {
  $page = $event->arguments(0);
  $language = $event->arguments(1);

  MarkupSitemap::applyLanguageSupportHooks();

  // Add additional pages
});

It should internally track whether or not they've been added, but I've simply not tested it yet, so can't say for sure. ?That aside though, surely I can add the hooks directly in ___getAdditionalPages?

If you'd like to give it a spin, please check out the develop branch directly.

  • Like 2
Link to comment
Share on other sites

4 hours ago, Mike Rockett said:

If you'd like to give it a spin, please check out the develop branch directly.

Just gave it a try — works like a charm! Thanks for merging so quickly.

A thought on the hook name: would it make more sense to call it getAdditionalPageUrls (including "urls") to clarify it's returning additional URLs for a specific page? One might need a hook at some point to get additional pages independent of any other page (who knows why), which would then appropriately be called getAdditionalPages.

  • Like 1
Link to comment
Share on other sites

1 hour ago, d'Hinnisdaël said:

Just gave it a try — works like a charm! Thanks for merging so quickly.

A thought on the hook name: would it make more sense to call it getAdditionalPageUrls (including "urls") to clarify it's returning additional URLs for a specific page? One might need a hook at some point to get additional pages independent of any other page (who knows why), which would then appropriately be called getAdditionalPages.

Great stuff - and does the language support hook-setup work as expected?

You're quite right indeed, will sort that out some time tomorrow. ?

Link to comment
Share on other sites

  • 2 weeks later...
  • 2 weeks later...
  • 1 month later...

@Mike Rockett, how do you feel about preventing caching — entirely, or as a config setting — for logged in users? I think it's probably a bit unexpected that sitemap can include pages that guests have no access to ?

In cases where sitemap is cached for logged in users, cache key should probably include the roles of the user. That won't be entirely bulletproof, but at least less likely to cause problems ?

  • Like 3
Link to comment
Share on other sites

@teppo, interesting idea indeed. In my opinion though, I feel like that defeats the purpose of a sitemap, which is really designed for guest and search engine access (in which case, only templates of those pages should be added to the config in the first place). I don't see why items should be conditionally added based on role if the consumer won't make much use of it… If you can see a reason how/why they would, then I'm happy to add it in.

Link to comment
Share on other sites

@Mike Rockett, I think we're talking about different things here. Sorry, my bad — should've been more clear! ?

The thing is that currently this module caches content for logged in users as well, which in turn means that the XML sitemap that guest users then view might contain non-public content. This probably shouldn't be the case, at least not by default: if the user is logged in, it would be safer not to cache anything at all, or you may end up with a sitemap that publicly displays content that wasn't meant to be public (even if only URLs and update times)... not to mention that (as you pointed out above) including non publicly viewable URLs is not particularly useful anyway.

The idea about including roles in cache key was mostly just so that if there's indeed some use case where folks prefer current behaviour, that could still work. I can't think of many reasons for that either, though, unless it's for some sort of an internal API / index or something ?

Link to comment
Share on other sites

@teppo – fair enough, you're actually quite right. So the options are then to either have the option to turn off caching for authed users or use use role-based caching for authed users, or perhaps even dictate some more fine grained rules behind which pages can be added to the sitemap. At the end of the day, the sitemap should always resemble publicly-viewable content only, so perhaps that's a better option (though I don't know how feasible it is from an implementation perspective). Another thing to note is that individual pages can be turned off from their Sitemap settings, though perhaps in some cases, that could be quite a chore (depending on the tree-structure and whether or not said pages are all common to one parent).

Link to comment
Share on other sites

1 hour ago, Mike Rockett said:

So the options are then to either have the option to turn off caching for authed users [...]

I think this would be a very good first step, and would probably be quite enough for almost all use cases. If you could make this change, I'd be a very happy user indeed ?

1 hour ago, Mike Rockett said:

Another thing to note is that individual pages can be turned off from their Sitemap settings, though perhaps in some cases, that could be quite a chore (depending on the tree-structure and whether or not said pages are all common to one parent).

Options like these are great to have, but from my point of view automation is always the best solution (if it can be done, that is). After realizing that the cache can now contain pages that are not public I turned to this option, but doing it manually on a site with loads of non-public pages all over the place... not fun ?

Slightly off-topic:

I was under the impression that the "Templates without sitemap access" setting would also affect child pages, removing them from the sitemap, but at least on one site it doesn't work like that: parent page (the one with a template that was specifically selected here) doesn't show up, but its children still do. Not sure if that's a bug or something I've misunderstood?

Link to comment
Share on other sites

FYI, I include an "ignore" field to my templates that allows for control over a page's visibility in menus, search results, and the sitemap. Then, I add that logic to the appropriate template files:

image.png.23e6501fad13408fd0685ac54059f9ee.png

// Hide from XML sitemap
case ($page->hasField("ignore") && $page->ignore->hasValue("sitemap_ignore")):
	return "";
	break;

// Hide child pages from XML sitemap
case ($page->parent->hasField("ignore") && $page->parent->ignore->hasValue("sitemap_ignore_children")):
	return "";
	break;

Here is the exported field in case anyone else wants to do the same:

{
    "ignore": {
        "id": 392,
        "name": "ignore",
        "label": "Visibility",
        "flags": 0,
        "type": "FieldtypeOptions",
        "inputfieldClass": "InputfieldCheckboxes",
        "collapsed": 2,
        "optionColumns": 0,
        "tags": "meta",
        "icon": "ban",
        "initValue": "",
        "showIf": "",
        "themeOffset": "",
        "themeBorder": "",
        "themeColor": "",
        "columnWidth": 100,
        "required": "",
        "requiredIf": "",
        "defaultValue": "",
        "export_options": {
            "default": "1=menu_ignore|Hide page from navigation menus\n2=menu_ignore_children|Hide child pages from navigation menus\n3=search_ignore|Hide page from site search results\n4=search_ignore_children|Hide child pages from site search results\n5=sitemap_ignore|Hide page from XML sitemap\n6=sitemap_ignore_children|Hide child pages from XML sitemap"
        }
    }
}

 

  • Like 1
Link to comment
Share on other sites

@teppo – I've pushed an update to the develop branch that adds an update policy config option. I think the sensible default is to set it to guests only, which is what I've done here. Let me know if you feel it should be the other way around. The HTTP headers now also provide a reason for skipping the cache. Would you be happy to test it out?

Regarding templates without sitemap access, the idea is to only target pages. Built it way back when, so can't really remember why I did it this way, but I do remember having a reason. We could well add an option to also say that children of these templates should be excluded... ?

--

There are other issues on the repo at the moment, some of which discuss features people would like to see. Unfortunately, my time is very limited and so I invite module developers to contribute if they feel up to it (I'd be super grateful). Happy to spend time discussing things and reviewing PRs and issuing quick bug fixes, but am not able to get to feature development on this module right now…

Link to comment
Share on other sites

  • 1 month later...

Awesome module @Mike Rockett, been using this on a couple of sites last year without any hiccups.

I have a case here that I'm not sure how to handle right now.

Following this post

Or this bit more specifically

/site/templates/includes/hooks.php

/**
 * This hook modifies the default behavior of the Page::path function (and thereby Page::url)
 *
 * The primary purpose is to redefine blog posts to be accessed at a URL off the root level
 * rather than under /posts/ (where they actually live). 
 *
 */
wire()->addHookBefore('Page::path', function($event) {
  $page = $event->object;
  if($page->template == 'post') {
    // ensure that pages with template 'post' live off the root rather than '/posts/'
    $event->replace = true;
    $event->return = "/$page->name/";
  }
}); 

I set up a hook to mask a url segment that I don't want to show up in page paths in a certain section of my site.

This is all fine and dandy and working as advertised, but since the hook is living in /site/templates/ and is only affecting the front-end part of the site, the paths that show up in sitemap.xml still show the original url segment ( /posts/ in @ryan 's example)

Now ideally I'd rather not mess with PW's page path behaviour in the backend, so I'm wondering if there's a way to somehow make sitemap.xml reflect the hooks from the front-end side of things without too much hacking around?

Also new to hooks and trying to be careful around them ?

All the best and thanks in advance!

Link to comment
Share on other sites

@Andi – interesting… I'm sure this could be achieved with a hook that gets triggered when iterating pages. Haven't worked with the hook system in a while, but I guess that adding support for addHookBefore('Sitemap::page') would be a decent idea. Unfortunately though, I don't have the time to get to this right now. If you or anyone else is willing to look into implementing this, please feel free to sumit a PR. If that hasn't happened and I find some time in the next week or so, then I'll jump on it.

As a last resort, there's always the getAdditionalPages hook that was recently added. It would mean that you exclude your posts from the sitemap and use that hook to add all your posts with the correct paths.

  • Like 1
Link to comment
Share on other sites

I haven't used this module until now. I thought I might give it a try for a new site I am developping at the moment. After I installed the module and wanted to have a look at how the sitemap woud look like, I got this notice:

Quote

Notice: Trying to access array offset on value of type null in …\site\assets\cache\FileCompiler\site\modules\markup-sitemap-0.5.0\MarkupSitemap.module.php on line 413

and a few other lines as well. Plus this warning:

Quote

Warning: Cannot modify header information - headers already sent by (output started at …\site\assets\cache\FileCompiler\site\modules\markup-sitemap-0.5.0\MarkupSitemap.module.php:427) in …\site\assets\cache\FileCompiler\site\modules\markup-sitemap-0.5.0\MarkupSitemap.module.php on line 192

When editing one of the pages, there is this notice:

Quote

Notice: Trying to access array offset on value of type null in …\site\modules\markup-sitemap-0.5.0\src\Concerns\ConfiguresTabs.php on line 56

what seems to be the problem here?

PW: 3.0.148, MarkupSitemap 0.6.0

Link to comment
Share on other sites

@joe_ma – Apologies for not getting back to you, I didn't see your post until now ? I'm not quite sure why you're getting those errors, though it seems like something else is interfering with the module output and or your pages. Could you share a list of the other modules you are using and perhaps any custom code/hooks that are at play? That might help me find out what the problem is, which is important as this issue hasn't come up before and I'd like to ensure it doesn't happen to others too. Thanks!

@uiui – The module supports sites with the multisite and language support modules installed. I haven't tested it it in a while, so if you have any issues, please let me know. ? 

Link to comment
Share on other sites

Other modules:

FieldtypeTable
FormBuilder
InputfieldCKEditor
somatonic-Emailobfuscator

No hooks, everything else is out of the box. Installed profile is default, as of now without any code-modifications.

Link to comment
Share on other sites

  • bedak changed the title to MarkupSitemap: Fatal Error after updating to 0.8.1 (PHP 8+)

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
×
×
  • Create New...