Pete

Module: XML Sitemap


I'm on the verge of converting your code back to a template, this is getting silly...

Just want to clarify that the problem is not with Pete's module. There was an obscure problem with the modules installer in the core which has been fixed. I'm not certain this is the problem in your case as you are still getting a different behavior than us, but I suggest grabbing the latest copy of ProcessWire before trying anything else.


Just want to clarify that the problem is not with Pete's module.

My apologies for implying any such thing, just getting frustrated. When I saw this module it looked like a great timesaver, and that hasn't worked out.

I'm uploading a fresh copy of PW now, we'll see how it goes.

EDIT: Still no go!


Can you confirm that you still get this same error message and that your ProcessWire reports its version number as 2.2?

Exception: Unable to create path: /MarkupSitemapXML/ (in /home/theseeke/public_html/wire/core/CacheFile.php line 62) This error message was shown because you are logged in as a Superuser. Error has been logged.

thanks,

Ryan


I just downloaded a fresh/blank copy of PW 2.2 and installed this module just to make sure it didn't have anything to do with the dev site I had tested on before. But it's working as expected, creating the dir in the right place, etc. So there must be something else that I'm missing. The behavior you are describing definitely indicates a core bug. I can't think of any other possibility. I'm going to do more testing here and hope to find and push a solution shortly.


I'm pleased to say that this issue is resolved, and when it came right down to it, Problem Existed Between Keyboard and Chair.

Stupid me forgot to read the instructions, and installed the module to /wire/modules instead of /site/modules.

Now that it's in the right place it's working fine, cheers!


Thanks for reporting back, glad that it's working. I was stuck trying to determine how it was doing that, so it's a relief to hear it's resolved. This has still been valuable though as we did solve a bug as a result (top of this page).


Hi there,

I've been using this module on a couple of sites in place of the template I previously used and it certainly is a boon in keeping my template folder a lot tidier — great work!

I changed line 53 to check for access when iterating the children so that user-restricted pages do not show up in the sitemap. I think this is a saner default, but it could be a setting too.

foreach($page->children("check_access=1") as $child) $entry .= $this->sitemapListPage($child);

Thanks,

Stephen


Stephen: check_access=1 is default, so you don't have to include that. If you don't want to check access, then you need to use check_access=0.


Very true. I obviously visited the sitemap straight after installing, while still logged in as admin. My mistake!

Thanks, apeisa


Want to share a few things I've added for a magazine site I'm building:

First of all, I think "priority" and "changefreq" are fairly important if you're going to have a sitemap at all. This post has some info on Google's guidelines: http://www.eduki.com...-are-important/

What I decided to do was quickly add two global fields to PW so I can set these values manually in each page:

-sitemap_priority

-changefreq

And in the code:

public function sitemapListPage($page) {
    $entry = "";
    $default_priority = "0.5";
    $default_changefreq = "monthly";

    include $this->fuel('config')->paths->templates . "sitemap_module_defaults.inc";

    if ($page->sitemap_ignore == 0 || $page->path == '/') { // $page->path part added so that it ignores hiding the homepage, else you wouldn't have ANY pages returned
        $modified = date('Y-m-d', $page->modified);
        $entry = "\n <url>\n";
        $entry .= " <loc>{$page->httpUrl}</loc>\n";
        $entry .= " <lastmod>{$modified}</lastmod>\n";

        if(!empty($page->sitemap_priority)) {
            $entry .= " <priority>{$page->sitemap_priority}</priority>\n";
        } else {
            $entry .= " <priority>{$default_priority}</priority>\n";
        }

        if(!empty($page->changefreq)) {
            $entry .= " <changefreq>{$page->changefreq}</changefreq>\n";
        } else {
            $entry .= " <changefreq>{$default_changefreq}</changefreq>\n";
        }

        $entry .= " </url>";
        if($page->numChildren) {
            foreach($page->children as $child) $entry .= $this->sitemapListPage($child);
        }
    }
    return $entry;
}

The sitemap_module_defaults.inc file in the templates dir is so I can set some values on the fly without doing it manually:

<?php
switch($page->template->name) {
    case "blog_post":
    case "blog_topic":
    case "blog_topic_type":
        $default_priority = "0.7";
        $default_changefreq = "daily";
        break;
}
?>

That's done and works fine for me.
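One thing the code above leaves open is what happens when someone types an invalid value into the sitemap_priority or changefreq field, or when a URL contains an ampersand. Below is a minimal standalone sketch of one way to guard against that. The function name sitemapEntry and the fallback values are assumptions for illustration, not part of the module; the list of allowed changefreq values and the 0.0 to 1.0 priority range come from the sitemaps.org protocol.

```php
<?php
// Hypothetical helper, not part of the module: builds one <url> entry
// and falls back to defaults when a field holds an invalid value.
function sitemapEntry($loc, $modified, $priority = null, $changefreq = null) {
    // Values allowed for <changefreq> by the sitemaps.org protocol
    $allowed = array('always', 'hourly', 'daily', 'weekly', 'monthly', 'yearly', 'never');

    // Priority must be a number between 0.0 and 1.0
    if (!is_numeric($priority) || $priority < 0 || $priority > 1) {
        $priority = 0.5;
    }
    if (!in_array($changefreq, $allowed, true)) {
        $changefreq = 'monthly';
    }

    $entry  = "\n <url>\n";
    // Escape &, <, > and quotes so the XML stays well-formed
    $entry .= " <loc>" . htmlspecialchars($loc, ENT_QUOTES, 'UTF-8') . "</loc>\n";
    $entry .= " <lastmod>" . date('Y-m-d', $modified) . "</lastmod>\n";
    $entry .= " <priority>" . number_format((float) $priority, 1) . "</priority>\n";
    $entry .= " <changefreq>{$changefreq}</changefreq>\n";
    $entry .= " </url>";
    return $entry;
}

// Example call with a placeholder URL
echo sitemapEntry('http://example.com/blog/?a=1&b=2', time(), '0.7', 'daily');
```

Inside the module, the same checks could sit right before the priority and changefreq lines of sitemapListPage(), with the template defaults used as the fallback values.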

Something I found frustrating with the sitemap module was that I couldn't add virtual pages I made. For example, if there's a page with urlSegments with some kind of page manipulation they won't show up on the sitemap, for obvious reasons. So what I did was add this method to the module:

public function sitemapListVirtualPage($httpUrl, $modified = NULL, $sitemap_priority = "0.5", $changefreq = "monthly") {
    $entry = "\n <url>\n";
    $entry .= " <loc>{$httpUrl}</loc>\n";

    // Only format and output <lastmod> when a timestamp was actually passed in;
    // calling date() first would always produce a non-empty string
    if($modified) {
        $modified = date('Y-m-d', $modified);
        $entry .= " <lastmod>{$modified}</lastmod>\n";
    }

    if($sitemap_priority) {
        $entry .= " <priority>" . (float)$sitemap_priority . "</priority>\n";
    }

    if($changefreq) {
        $entry .= " <changefreq>{$changefreq}</changefreq>\n";
    }

    $entry .= " </url>";
    return $entry;
}

And added this include to the init method before the output is saved to cache:


public function init() {
    // Intercept a request for a root URL ending in sitemap.xml and output
    if (strpos($_SERVER['REQUEST_URI'], wire('config')->urls->root . 'sitemap.xml') !== FALSE) {
        // Check for the cached sitemap, else generate and cache a fresh sitemap
        $cache = wire('modules')->get("MarkupCache");
        if(!$output = $cache->get("MarkupSitemapXML", 3600)) {
            $output = "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n";
            $output .= '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">';
            $output .= $this->sitemapListPage(wire('pages')->get("/"));

            include $this->fuel('config')->paths->templates . "sitemap_module_virtual.inc";

            $output .= "\n</urlset>";
            $cache->save($output);
        }

        header("Content-Type: text/xml");
        echo $output;
        exit;
    }
}

And in my sitemap_module_virtual.inc file I did this:

<?php
foreach(wire('pages')->find("template=blog_topic|blog_topic_type") as $real_page) {
    $output .= $this->sitemapListVirtualPage("http://" . $this->fuel('config')->httpHost . $real_page->url . "archives/", wire('pages')->get("template=blog_post,sort=-created")->modified, "0.4", "daily");
}
foreach(wire('pages')->find("template=blog_topic|blog_topic_type") as $real_page) {
    $output .= $this->sitemapListVirtualPage("http://" . $this->fuel('config')->httpHost . $real_page->url . "rss/", wire('pages')->get("template=blog_post,sort=-created")->modified, "0.4", "daily");
}
?>

In this part, I've set up virtual /archives/ and /rss/ sub-pages for each blog_topic and sub-topic (blog_topic_type) using urlSegments. I wanted these to show up in the sitemap, even though they're probably not super important.

Usually I wouldn't bother doing this if it was just these kinds of pages. But what if you have countless articles with manipulated URLs that don't show up in the sitemap? That's a perfect example of why I set this up with the future in mind as well.
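The two loops in sitemap_module_virtual.inc differ only in the URL segment, so with more segments the file would grow quickly. Here is a rough standalone sketch of building the whole list of virtual URLs in one pass. The helper name virtualUrls and the example.com URLs are made up for illustration; in the module each resulting URL would still be handed to sitemapListVirtualPage().

```php
<?php
// Hypothetical helper: combines every parent URL with every URL segment.
function virtualUrls(array $parentUrls, array $segments) {
    $urls = array();
    foreach ($parentUrls as $parent) {
        // Normalise so there is exactly one slash between the parts
        $parent = rtrim($parent, '/') . '/';
        foreach ($segments as $segment) {
            $urls[] = $parent . trim($segment, '/') . '/';
        }
    }
    return $urls;
}

// Placeholder parent URLs standing in for the blog_topic pages
$urls = virtualUrls(
    array('http://example.com/news/', 'http://example.com/reviews'),
    array('archives', 'rss')
);
print_r($urls);
```

Adding a new kind of virtual page then only means adding one more entry to the segments array.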

I've been trying to understand how to do these things for a while, so I hope it helps someone out.

EDIT: I have a download with all of my changes if anyone's interested in taking a look: http://clintonskakun.com/processwire-docs/posts/the-xml-sitemap-module-with-priority-and-changefreq-and-more/

Edited by ClintonSkakun

Thanks for posting this, these seem like some useful additions and some good insights on sitemap.xml too.


jukooz asked me a while back how to stick the sitemaps into a template as he's using the multisite module and currently I guess that this module will pull the sitemap for EVERY site on an install...?

This should work on a per-site basis having it in a separate template file per site as a workaround for now, but pay attention to the comments please as there are things to change per-site - please also note that this is un-tested and largely just pulled from the module and tweaked for pasting into a template for use instead of the module:

EDIT: See attachment as the forum software tries to parse a URL in the code : sitemap.txt

I've also updated the module (see first post) to v1.0.3 to check if the page is viewable before including it in the sitemap - I noticed that it was incorrectly listing pages that had no template file... oops!


Hey,

is it possible to use this module with the LanguageLocalizedURL module?

I want to make the site multilanguage like this:

www.url.de/de/testindeutsch/

www.url.de/en/testinenglish/

Would like to hear from you... Greets Jens alias DV-JF


While I've not tried it, I would guess that this module does not collaborate with or accommodate the LanguageLocalizedURL module in any special way. Though Pete could say for sure. 


It doesn't accommodate it at the moment, no, but I may need this myself soon so if you can wait a week or two I may have an update (it's not at the top of my list at the moment but I might surprise myself and do it sooner ;)).

Thinking out loud, it needs:

  • Check if LanguageLocalizedURL is installed
  • Find the root page for each language
  • Generate a sitemap.xml page under each root page (it doesn't really generate a page in the database, it just outputs the sitemap if you request that URL)

I think that's it, but I need to get to grips with the LanguageLocalizedURL first.
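As a rough illustration of the third point, this is how a request for a per-language sitemap URL could be recognised. It is only a standalone sketch under a few assumptions: the site lives at the domain root, language gateways are two-letter codes like /de/ and /en/, and the function name sitemapLanguage is invented here. It does not touch the LanguageLocalizedURL module itself.

```php
<?php
// Hypothetical helper: returns the language code for a request such as
// /de/sitemap.xml, '' for the plain /sitemap.xml, or null for anything else.
function sitemapLanguage($requestUri, array $languages) {
    // Look at the path only, ignoring any query string
    $path = (string) parse_url($requestUri, PHP_URL_PATH);

    if ($path === '/sitemap.xml') {
        return '';
    }
    // Accept /xx/sitemap.xml only when xx is a configured language
    if (preg_match('#^/([a-z]{2})/sitemap\.xml$#', $path, $m)
        && in_array($m[1], $languages, true)) {
        return $m[1];
    }
    return null;
}

var_dump(sitemapLanguage('/de/sitemap.xml', array('de', 'en')));
```

A null result would mean the request is not for a sitemap at all and should be left to ProcessWire as usual.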


Pete, the LLU module doesn't have separate trees. It uses a gateway page for each language in the root, like "/de/" and "/en/", with URL segments to then get the page and switch the user language. The site structure is still the same as without it; you just use text language fields.

The parsing of the URL happens automatically through these gateway pages, and the module hooks into Page::path to change the URL of pages system-wide. So if you do an echo $page->url you'll get the URL in the language you're currently in, like /en/about-us/ or /de/ueber-uns/.

Has anyone tried yet whether it already works?


Soma just submitted a pull request and I merged it.

The 1.0.4 version should allow you to use this with the LanguageLocalizedURL module - let us know how you get on and thanks to Soma ;)


Pete, thanks, but I've already submitted another pull request; I forgot to up the cache time.

To fully work, it would also have to add the "language_published" check to see whether the language version of the page is really published. I'll try to add that later.

I also wanted to add that I never use sitemaps for Google and never will again. I tried a long time ago, but it doesn't really help at all if you build the site carefully; it just eats time to set up and to keep checking that everything still works. It's not as easy as it first seems and can even be counterproductive if not done carefully. The problem with this module as it is now is that it will not find and list pages that are added through urlSegments, and I don't see an easy way to do that. It also doesn't have weighting etc.


Thanks - merged it :)

The main reason I created it was because it gets a new site listed faster - plain and simple. That was my experience from sitemaps a few years ago at least so I hope it still happens, but it does seem to work from my experience.

I've never bothered with weighting or anything like that. Not bothered about URL segments either as for a start the crawler will find a link to that from a normal page on the site presumably, but my main goal was to get Google to crawl the page sooner than it normally would - and I think it will do this every time it looks at the sitemap and sees a new page (assuming it doesn't find it in the crawl already on its own by following links in the content).

Think of it this way - if you release a brand new site without telling Google, it will take until someone links to your site before Google is aware of its existence. On a small, personal site with no blog or comments that could be a loooong time, but the first thing I usually do is launch a site and submit it so I know it's done.

The argument against this is there are other ways of doing that as search engine companies often have pages where you can just type your domain in and they will search it sometime later, but I like to keep Google informed when I add new pages etc just in case it has a lapse in concentration and misses something :)


Just upped another pull request... :)

I think Google doesn't parse the website if it has a sitemap.xml. Or has this changed? I've found myself fixing other people's websites by actually removing the sitemap from Google.

Google is so fast nowadays that it doesn't matter that much, as a page won't be in the index before the next index update anyway. I could be wrong, as things change all the time, but that's my personal experience and what I've read. Adding a sitemap page to your site will also help if you really worry about it. As with most SEO things, you have to take everything with a grain of salt in the end.

It may be that Google doesn't come to your site right away when it's new, but tests have shown that with even just 1 link to your site, Google will parse it within 1 day if you have a nice structure.


I used to post a link on my website to give my new client sites a poke. Works well. Or even in this forum.


Forgot to mention, I've pulled in the last of Soma's pull requests so this is now v1.0.5 to implement the added functionality he wrote as well as the fixes for the bugs he created :P;)

My fault for not testing :D
