ryan

Load RSS Feeds (MarkupLoadRSS)

ProcessWire RSS Feed Loader


Given an RSS feed URL, this module will pull it, and let you foreach() it or render it. This module will also cache feeds that you retrieve with it. The module is designed for ProcessWire 2.1+, but may also work with 2.0 (haven't tried yet).

This module is the opposite of the MarkupRSS module that comes with ProcessWire: that module creates RSS feeds, whereas this one loads them and gives you easy access to the data to do whatever you want.

For a simple live example of this module in use, see the processwire.com homepage (and many of the inside pages) for the "Latest Forum Post" section in the sidebar.

Download at: https://github.com/r...n/MarkupLoadRSS

REQUIREMENTS


This module requires that your PHP installation have the 'allow_url_fopen' option enabled. By default, it is enabled in PHP. However, some hosts turn it off for security reasons. This module will prevent itself from being installed if your system doesn't have allow_url_fopen. If you run into this problem, let me know as we may be able to find some other way of making it work without too much trouble.
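
If you want to check the setting yourself before installing, a minimal sketch in plain PHP could look like this. The function name urlFopenEnabled() is made up for this example and is not part of the module:

```php
<?php
// Hypothetical helper (not part of MarkupLoadRSS): reports whether PHP is
// allowed to open URLs with file functions such as file_get_contents().
function urlFopenEnabled() {
    // ini_get() returns a string such as "1", "On" or "", so normalize it to a boolean
    return filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN);
}

echo urlFopenEnabled()
    ? "allow_url_fopen is enabled\n"
    : "allow_url_fopen is disabled\n";
```
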

INSTALLATION


The MarkupLoadRSS module installs in the same way as all PW modules:

1. Copy the MarkupLoadRSS.module file to your /site/modules/ directory.

2. Log in to the ProcessWire admin, click 'Modules', then 'Check for New Modules'.

3. Click 'Install' next to the Markup Load RSS module.

USAGE


The MarkupLoadRSS module is used from your template files. Usage is described with these examples:

Example #1: Cycling through a feed

<?php

  $rss = $modules->get("MarkupLoadRSS");
  $rss->load("http://www.di.net/articles/rss/");

  foreach($rss as $item) {
      echo "<p>";
      echo "<a href='{$item->url}'>{$item->title}</a> ";
      echo $item->date . "<br /> ";
      echo $item->description;
      echo "</p>";
  }

Example #2: Using the built-in rendering

<?php

  $rss = $modules->get("MarkupLoadRSS");
  echo $rss->render("http://www.di.net/articles/rss/");

Example #3: Specifying options and using channel titles

<?php

  $rss = $modules->get("MarkupLoadRSS");

  $rss->limit = 5;
  $rss->cache = 0;
  $rss->maxLength = 255;
  $rss->dateFormat = 'm/d/Y H:i:s';

  $rss->load("http://www.di.net/articles/rss/");

  echo "<h2>{$rss->title}</h2>";
  echo "<p>{$rss->description}</p>";
  echo "<ul>";

  foreach($rss as $item) {
       echo "<li>" . $item->title . "</li>";
  }

  echo "</ul>";

OPTIONS


Options MUST be set before calling load() or render().

<?php

  // specify that you want to load up to 3 items (default = 10)
  $rss->limit = 3;

  // set the feed to cache for an hour (default = 120 seconds)
  // if you want to disable the cache, set it to 0.
  $rss->cache = 3600;

  // set the max length of any field, i.e. description (default = 2048)
  // field values longer than this will be truncated
  $rss->maxLength = 255;

  // tell it to strip out any HTML tags (default = true)
  $rss->stripTags = true;

  // tell it to encode any entities in the feed (default = true);
  $rss->encodeEntities = true;

  // set the date format used for output (use PHP date string)
  $rss->dateFormat = "Y-m-d g:i a";

See the $options array in the class for more options. You can also customize all output produced by the render() method, though it is probably easier just to foreach() the $rss yourself. But see the module class file and $options array near the top to see how to change the markup that render() produces.
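
To make the stripTags, maxLength and encodeEntities options more concrete, here is a rough standalone sketch of the kind of cleanup they describe. It is not the module's actual code: the function name cleanFeedText() and the order of the steps are assumptions for illustration only.

```php
<?php
// Hypothetical helper, NOT taken from MarkupLoadRSS. It only approximates
// what the stripTags, maxLength and encodeEntities options describe.
// Assumed order: strip tags, then truncate, then encode entities.
function cleanFeedText($value, $maxLength = 2048, $stripTags = true, $encodeEntities = true) {
    $value = trim($value);
    if($stripTags) $value = strip_tags($value);
    // substr() counts bytes, so multibyte text may need mb_substr() instead
    if(strlen($value) > $maxLength) $value = substr($value, 0, $maxLength);
    if($encodeEntities) $value = htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
    return $value;
}

// tags removed, cut to 11 characters, ampersand encoded
echo cleanFeedText("<b>Tom & Jerry</b> return", 11); // prints: Tom &amp; Jerry
```

Truncating before encoding, as assumed here, avoids cutting an entity such as &amp; in half.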

MORE DETAILS


This module loads the given RSS feed and all data from it. It then populates that data into a WireArray of Page-like objects. All of the fields in the feed's <item> elements are accessible, so you can use whatever the feed provides. The most common and expected field names in the RSS channel are:

  • $rss->title
  • $rss->pubDate (or $rss->date)
  • $rss->description (or $rss->body)
  • $rss->link (or $rss->url)
  • $rss->created (unix timestamp of pubDate)

The most common and expected field names for each RSS item are:

  • $item->title
  • $item->pubDate (or $item->date)
  • $item->description (or $item->body)
  • $item->link (or $item->url)
  • $item->created (unix timestamp of pubDate)

For convenience and consistency, ProcessWire translates some common RSS fields to the PW-equivalent naming style. You can choose to use either the ProcessWire-style name or the traditional RSS name, as shown above.

HANDLING ERRORS


If an error occurred when loading the feed, the $rss object will have 0 items in it:

<?php

  $rss->load("...");
  if(!count($rss)) { /* handle the error here */ }

In addition, the $rss->error property always contains a detailed description of what error occurred:

<?php

  if($rss->error) { echo "<p>{$rss->error}</p>"; }

I recommend only checking for or reporting errors when you are developing and testing. On production sites you should skip error checking/testing, as blank output is a clear indication of an error. This module will not throw runtime exceptions so if an error occurs, it's not going to halt the site.


Great to have such a module, thanks Ryan!

Though can't install:

Warning: mkdir() [function.mkdir]: No such file or directory in /Applications/XAMPP/xamppfiles/htdocs/pw2.ch/site/modules/MarkupLoadRSS.module on line 447


Oops, looks like I messed up something last minute. Just committed the fix. Thanks for letting me know.


Thanks, just installed and tested a run and works great so far!  8)


Great work Ryan! Only thing I might add is support for multiple feeds. Though it might complicate this module too much?


I want to display a RSS feed that contains items like below and it works well, except for the author field (dc:creator), which isn't parsed. Is there a way to parse this value as well?

		<item>
	<title>Taalkundigen Uppsala ontcijferen geheimschrift</title>
	<link>http://www.wereldwijzerzweden.net/2011/11/03/uppsala-geheimschrift-taalkundige-copiale/</link>
	<comments>http://www.wereldwijzerzweden.net/2011/11/03/uppsala-geheimschrift-taalkundige-copiale/#comments</comments>
	<pubDate>Thu, 03 Nov 2011 16:03:16 +0000</pubDate>
	<dc:creator>Marcel Burger</dc:creator>
	<category><![CDATA[Actueel]]></category>
	<category><![CDATA[berlijn]]></category>
	<category><![CDATA[Copiale]]></category>
	<category><![CDATA[geheimschrift]]></category>
	<category><![CDATA[universiteit]]></category>
	<category><![CDATA[uppsala]]></category>

	<guid isPermaLink="false">http://www.wereldwijzerzweden.net/?p=7227</guid>
	<description><![CDATA[<a href="http://www.wereldwijzerzweden.net/2011/11/03/uppsala-geheimschrift-taalkundige-copiale/"><img align="left" hspace="5" width="150" src="http://www.wereldwijzerzweden.net/images/copiale_280.jpg" class="alignleft wp-post-image tfe" alt="Deel uit vrijgegeven beeld van het Copialeschrift" title="copiale_280.jpg" /></a>3 november 2011 &#124; Twee Zweedse taalkundigen en een Amerikaanse wetenschapper zijn erin geslaagd een 280 jaar oud geheimschrift uit Duitsland met voorheen onbegrijpelijke tekens te vertalen.]]></description>
	<wfw:commentRss>http://www.wereldwijzerzweden.net/2011/11/03/uppsala-geheimschrift-taalkundige-copiale/feed/</wfw:commentRss>
	<slash:comments>0</slash:comments>
	</item>

I outputted the $rss array with print_r(); and it doesn't contain the dc:creator field (some others seem to be missing as well,  but I don't need these  ;))

/Jasper


Just a guess, but do the other fields it's not parsing contain a colon as well?


If I recall correctly, SimpleXML doesn't work with the properties that have colons in them. But you can fix that by replacing the colon properties with underscore properties in the XML data. So in this case, you'd want to add this line in the load() function:

<?php
public function load($url) { 
    $this->items = new WireArray();
    $xmlData = $this->loadXmlData($url);
    $xmlData = str_replace('dc:creator', 'dc_creator', $xmlData); 

Or you may be able to cover all the colon properties at once using a regexp like this:

<?php
$xmlData = preg_replace('{(</?[_a-z0-9]+):([_a-z0-9]+>)}i', '$1_$2', $xmlData); 

What that does is convert properties like <dc:creator> to <dc_creator> so that SimpleXML will understand them and likewise you can access them in the module. Let me know if this works for you. I'm not in a place where I can update the source on this module today, but will plan to add something like the above soon.
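
Here is a small standalone sketch of that idea, using a cut-down copy of the item posted above. The pattern is written out in full with the colon and a second capture group, plus an `i` flag so that mixed-case names such as wfw:commentRss would match too. The second half shows an alternative that is not from this thread: SimpleXML can also read prefixed elements directly through children(), without touching the XML.

```php
<?php
// Sample data modeled on the item quoted earlier in this thread.
$xmlData = <<<XML
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Example</title>
    <item>
      <title>Taalkundigen Uppsala ontcijferen geheimschrift</title>
      <dc:creator>Marcel Burger</dc:creator>
    </item>
  </channel>
</rss>
XML;

// Approach A: rewrite <dc:creator> to <dc_creator> before parsing.
$converted = preg_replace('{(</?[_a-z0-9]+):([_a-z0-9]+>)}i', '$1_$2', $xmlData);
$rss = simplexml_load_string($converted);
$creatorA = (string) $rss->channel->item[0]->dc_creator;

// Approach B (alternative, not from this thread): leave the XML alone and
// ask SimpleXML for the children that use the "dc" prefix.
$rss = simplexml_load_string($xmlData);
$dc = $rss->channel->item[0]->children('dc', true);
$creatorB = (string) $dc->creator;

echo $creatorA . "\n"; // Marcel Burger
echo $creatorB . "\n"; // Marcel Burger
```

Note that the pattern only matches tags without attributes, which is enough for the elements in this feed.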

I don't know why the <comments> property wouldn't be getting parsed, as that appears to just be a string (URL). I need to test and experiment with that one to find out why.


Thanks Ryan, replacing the colons works, both with the str_replace and the regexp.

"I'm not in a place where I can update the source on this module today, but will plan to add something like the above soon."

I also submitted (via Github) a double encoding issue (I am good at finding these  :P) in this module. You might want to take a look at that one at the same time. :-)

"I don't know why the <comments> property wouldn't be getting parsed, as that appears to just be a string (URL). I need to test and experiment with that one to find out why."

My fault  :-[, the comments property is parsed. One that didn't get parsed was the Category, but that may be because it appears multiple times. (guess).
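
For what it's worth, SimpleXML itself can read repeated elements if you loop over them. The sketch below is standalone and only illustrates that; it is a guess, not a confirmed diagnosis, that storing each value under its element name lets a later <category> overwrite an earlier one.

```php
<?php
// Cut-down sample item with a repeated <category> element.
$xmlData = <<<XML
<rss version="2.0"><channel><item>
  <title>Example item</title>
  <category><![CDATA[Actueel]]></category>
  <category><![CDATA[berlijn]]></category>
  <category><![CDATA[Copiale]]></category>
</item></channel></rss>
XML;

// LIBXML_NOCDATA merges CDATA sections into ordinary text nodes.
$rss = simplexml_load_string($xmlData, 'SimpleXMLElement', LIBXML_NOCDATA);

// Iterating over ->category visits every <category> element, not just one.
$categories = array();
foreach($rss->channel->item[0]->category as $category) {
    $categories[] = (string) $category;
}

echo implode(', ', $categories) . "\n"; // Actueel, berlijn, Copiale
```
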

The exact feed I am using is also in the Github issue, so you can test with it if you want/like.

/Jasper


Thanks for submitting the issue, I will fix. Also I'd like to find a way to get Comments (and any multi-item properties) working as well, should be easy. The feeds I'd originally tested with were pretty basic and didn't have these extended properties.


"Great work Ryan! Only thing I might add is support for multiple feeds. Though it might complicate this module too much?"

I needed multiple feeds, and it seemed to be a pretty straightforward implementation. Only a few modifications to the load method:

public function load($url) {
    $this->items = new WireArray();
    $rss = false;
    if(is_array($url)) {
        // multiple feeds: collect the <item> elements of every feed that loads
        $items = array();
        foreach($url as $feed) {
            $xmlData = $this->loadXmlData($feed);
            $xml = simplexml_load_string($xmlData);
            if(!$xml) continue; // skip a feed that failed to load or parse
            $items = array_merge($items, $xml->xpath('/rss//item'));
            $rss = $xml; // channel info comes from the last feed that parsed
        }
    } else {
        $xmlData = $this->loadXmlData($url);
        $rss = simplexml_load_string($xmlData);
    }

    if(!$rss) {
        // $url may be an array here, so flatten it for the message
        $msg = "Unable to load RSS feed at " . htmlentities(implode(', ', (array) $url)) . ": \n";
        foreach(libxml_get_errors() as $error) $msg .= trim($error->message) . " \n";
        $this->error($msg);
        return $this;
    }
    $this->channel['title'] = $this->cleanText((string) $rss->channel->title);
    $this->channel['description'] = $this->cleanText((string) $rss->channel->description);
    $this->channel['link'] = $this->cleanText((string) $rss->channel->link);
    $this->channel['created'] = strtotime((string) $rss->channel->pubDate);
    $this->channel['pubDate'] = date($this->options['dateFormat'], $this->channel['created']);
    $n = 0;
    // If we already have $items set, it means we are dealing with multiple sources. Let's sort them
    if(isset($items)) {
        // newest first
        usort($items, function ($x, $y) {
            return strtotime((string) $y->pubDate) - strtotime((string) $x->pubDate);
        });
    } else {
        $items = $rss->channel->item;
    }
    foreach($items as $item) {
        $a = new MarkupLoadRSSItem();
        foreach($item as $key => $value) {
            $value = (string) $value;
            if($key == 'pubDate') {
                // keep the timestamp as 'created' and format the date for output
                $value = strtotime($value);
                $a->set('created', $value);
                $value = date($this->options['dateFormat'], $value);
            } else {
                $value = $this->cleanText($value);
            }
            $a->set($key, $value);
        }
        $this->items->add($a);
        if(++$n >= $this->options['limit']) break;
    }
    return $this;
}

It checks whether $url is an array; if so, it loads/caches each feed and merges their RSS items into the $items array, which is later sorted by pubDate. This is fully backwards compatible: just pass an array instead of a single URL if you need to parse multiple feeds.
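
The merge-and-sort step on its own can be sketched outside the module like this. The function mergeFeedItems() and the two inline feeds are invented for the example; it takes XML strings instead of URLs so it runs without any network access.

```php
<?php
// Hypothetical helper for illustration; not the module's code.
// Takes already-loaded XML strings, merges their items, newest first.
function mergeFeedItems(array $xmlStrings, $limit = 10) {
    $items = array();
    foreach($xmlStrings as $xmlData) {
        $xml = simplexml_load_string($xmlData);
        if(!$xml) continue; // ignore feeds that do not parse
        foreach($xml->xpath('/rss/channel/item') as $item) {
            $items[] = array(
                'title' => (string) $item->title,
                'created' => strtotime((string) $item->pubDate),
            );
        }
    }
    // sort by timestamp, descending
    usort($items, function($a, $b) {
        if($a['created'] == $b['created']) return 0;
        return $a['created'] < $b['created'] ? 1 : -1;
    });
    return array_slice($items, 0, $limit);
}

$feedA = '<rss version="2.0"><channel>'
    . '<item><title>A1</title><pubDate>Thu, 03 Nov 2011 16:03:16 +0000</pubDate></item>'
    . '<item><title>A2</title><pubDate>Tue, 01 Nov 2011 09:00:00 +0000</pubDate></item>'
    . '</channel></rss>';
$feedB = '<rss version="2.0"><channel>'
    . '<item><title>B1</title><pubDate>Wed, 02 Nov 2011 12:00:00 +0000</pubDate></item>'
    . '</channel></rss>';

// Items come back in the order A1, B1, A2
foreach(mergeFeedItems(array($feedA, $feedB)) as $item) {
    echo $item['title'] . "\n";
}
```
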

If you guys can test that it works for you too, then maybe Ryan can put this in his version. I can do a pull request if you want (although it seems that the new and fancy GitHub for Windows messes up line endings).


I'm not sure that the W3 validator is picking it up right either? Seems like it is showing the whole thing as double entity encoded. Also tried loading in Safari, and it can't seem to read the feed correctly either. Firefox seems okay. Definitely something unusual going on with this feed, but I am not familiar enough with this particular format to know what's wrong. W3 validator isn't helping much since it's seeing the whole thing as double entity encoded.


Yeah, there definitely was something strange going on with that feed. Now it seems to be working on my end too, so they must have fixed that.

Ryan: have you thought about adding that multisource functionality to this module? I am already using it in a couple of places, and it has been working great. Of course, if you think the implementation should be different or an altogether different module, then let me know (or if you prefer a GitHub pull request).

I was thinking it might be more "pw" to have ->add(source_url) etc. and then load, instead of passing all the URLs in an array, load($array_of_urls), like it is currently.


Sounds good! Let's go for a pull request if it's convenient for you. Or just post the .module file and I can go through the diff.


Hi Ryan,

Thanks for this module.  Have been using it on our main site for awhile now.  Just wanted to let you know of an issue that I just discovered that others may run into, and see if there's a way to handle it.

I was trying to load a feed that for awhile was not responding.  The feed page wasn't throwing an error or even timing out, just loading for minutes on end.

This ended up causing a timeout on our site (the feed was loading on the main page) and producing this error in the PW log file:

Error Exception: MySQL server has gone away (in /mnt/stor7-wc2-dfw1/526843/www.agencypja.com/web/content/wire/core/Database.php line 118)

For now, we've just disabled that feed, but we are using the module to load other feeds.  Do you (or anyone else) know of a way to address this issue?  I don't see a timeout option in the module, but could certainly look into adding one if that is determined to be the best option.

Thanks.


Could you PM me the RSS feed you are working with? I can do some testing here. I believe we can get it working by switching MarkupLoadRSS to use the new WireHttp class in PW 2.3.10+, but I need an example to test with. 
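
In the meantime, and separate from WireHttp, plain PHP streams accept a read timeout through a stream context. This is only a sketch: feedContext() and fetchFeed() are names invented here, not module methods. Keep in mind that the timeout applies to each read on the connection, so a server that keeps trickling data could still take longer overall.

```php
<?php
// Hypothetical helpers, not part of MarkupLoadRSS.

// Build a stream context whose HTTP reads give up after $timeout seconds.
function feedContext($timeout = 5) {
    return stream_context_create(array(
        'http' => array('timeout' => $timeout),
    ));
}

// Fetch a feed, returning an empty string instead of failing loudly.
function fetchFeed($url, $timeout = 5) {
    $data = @file_get_contents($url, false, feedContext($timeout));
    return $data === false ? '' : $data;
}

// Usage (the URL is a placeholder):
// $xmlData = fetchFeed('http://example.com/feed.xml', 3);
// if($xmlData === '') { /* feed unavailable, skip it */ }
```
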


Hi Ryan,

Unfortunately, the feed that was causing problems is now back up and running normally.  I thought that I could recreate the issue by creating a php page on another server with a timeout set to at least 5 min, sleeping the script, and using that as the RSS feed, but that didn't work.

I'll be sure to let you know if I ever come across it again.

Thanks.


I always get empty RSS feed output! On my page I call the RSS module via a URL segment like blog/rss, and for the output I need the same page array that I use for the /blog page. But the RSS feed has no content!

$blogposts = $pages->find("template=post, publish_date<$today, sort=-publish_date, limit=10");

if($input->urlSegment1 === 'rss'){
  // retrieve the RSS module
  $rss = $modules->get("MarkupRSS");

  // configure the feed. see the actual module file for more optional config options.
  $rss->title = "Letzte Blogeinträge";

  $rss->render($blogposts);
  return;
} else {
  $content = renderPosts($blogposts, true);
}


I just downloaded MarkupLoadRSS module from M.Cramer's Github repo.

Here is the demo code I used for test purpose in my template :

        $rss = $modules->get("MarkupLoadRSS");
        $rss->load("http://rss.cbc.ca/lineup/canada.xml");

        foreach($rss as $item) {
            echo "<p>";
            echo "<a href='{$item->url}'>{$item->title}</a> ";
            echo $item->date . "<br /> ";
            echo $item->description;
            echo "</p>";
        }

All I get is this error :

Error: Call to a member function load() on a non-object (line 65 of C:\wamp\www\mysite\site\templates\home.php)

As if the module wants an object, like a $page or $config or something...

I looked up the code and the function is load($url). Would it be conflicting with something ?

I'm running v2.7.2

Edited by kongondo
merged your topic here, the module's support forum


The error says that $rss is not an object, so the call $modules->get doesn't return a module instance. Did you install the module after downloading it?

