ryan Posted September 19, 2011

ProcessWire RSS Feed Loader

Given an RSS feed URL, this module will pull it and let you foreach() it or render it. It also caches the feeds you retrieve with it. The module is designed for ProcessWire 2.1+, but may also work with 2.0 (haven't tried yet).

This module is the opposite of the MarkupRSS module that comes with ProcessWire: that module creates RSS feeds, whereas this one loads them and gives you easy access to the data to do whatever you want.

For a simple live example of this module in use, see the "Latest Forum Post" section in the sidebar of the processwire.com homepage (and many of the inside pages).

Download at: https://github.com/r...n/MarkupLoadRSS

REQUIREMENTS

This module requires that your PHP installation have the 'allow_url_fopen' option enabled. It is enabled in PHP by default, but some hosts turn it off for security reasons. The module will prevent itself from being installed if your system doesn't have allow_url_fopen. If you run into this problem, let me know, as we may be able to find some other way of making it work without too much trouble.

INSTALLATION

The MarkupLoadRSS module installs in the same way as all PW modules:

1. Copy the MarkupLoadRSS.module file to your /site/modules/ directory.
2. Log in to the ProcessWire admin, click 'Modules' and 'Check for New Modules'.
3. Click 'Install' next to the Markup Load RSS module.

USAGE

The MarkupLoadRSS module is used from your template files. Usage is described with these examples:

Example #1: Cycling through a feed

    <?php
    $rss = $modules->get("MarkupLoadRSS");
    $rss->load("http://www.di.net/articles/rss/");

    foreach($rss as $item) {
        echo "<p>";
        echo "<a href='{$item->url}'>{$item->title}</a> ";
        echo $item->date . "<br /> ";
        echo $item->description;
        echo "</p>";
    }

Example #2: Using the built-in rendering

    <?php
    $rss = $modules->get("MarkupLoadRSS");
    echo $rss->render("http://www.di.net/articles/rss/");

Example #3: Specifying options and using channel titles

    <?php
    $rss = $modules->get("MarkupLoadRSS");

    $rss->limit = 5;
    $rss->cache = 0;
    $rss->maxLength = 255;
    $rss->dateFormat = 'm/d/Y H:i:s';

    $rss->load("http://www.di.net/articles/rss/");

    echo "<h2>{$rss->title}</h2>";
    echo "<p>{$rss->description}</p>";
    echo "<ul>";
    foreach($rss as $item) {
        echo "<li>" . $item->title . "</li>";
    }
    echo "</ul>";

OPTIONS

Options MUST be set before calling load() or render().

    <?php
    // specify that you want to load up to 3 items (default = 10)
    $rss->limit = 3;

    // set the feed to cache for an hour (default = 120 seconds)
    // if you want to disable the cache, set it to 0.
    $rss->cache = 3600;

    // set the max length of any field, i.e. description (default = 2048)
    // field values longer than this will be truncated
    $rss->maxLength = 255;

    // tell it to strip out any HTML tags (default = true)
    $rss->stripTags = true;

    // tell it to encode any entities in the feed (default = true)
    $rss->encodeEntities = true;

    // set the date format used for output (use PHP date string)
    $rss->dateFormat = "Y-m-d g:i a";

See the $options array in the class for more options. You can also customize all output produced by the render() method, though it is probably easier just to foreach() the $rss yourself. See the module class file and the $options array near the top to see how to change the markup that render() produces.

MORE DETAILS

This module loads the given RSS feed and all data from it. It then populates that data into a WireArray of Page-like objects. All of the fields in each RSS <item> are accessible, so you can use whatever the feed provides.

The most common and expected field names in the RSS channel are:

    $rss->title
    $rss->pubDate (or $rss->date)
    $rss->description (or $rss->body)
    $rss->link (or $rss->url)
    $rss->created (unix timestamp of pubDate)

The most common and expected field names for each RSS item are:

    $item->title
    $item->pubDate (or $item->date)
    $item->description (or $item->body)
    $item->link (or $item->url)
    $item->created (unix timestamp of pubDate)

For convenience and consistency, ProcessWire translates some common RSS fields to the PW-equivalent naming style. You can use either the ProcessWire-style name or the traditional RSS name, as shown above.

HANDLING ERRORS

If an error occurred when loading the feed, the $rss object will have 0 items in it:

    <?php
    $rss->load("...");
    if(!count($rss)) { /* an error occurred */ }

In addition, the $rss->error property always contains a detailed description of what error occurred:

    <?php
    if($rss->error) {
        echo "<p>{$rss->error}</p>";
    }

I recommend only checking for or reporting errors when you are developing and testing. On production sites you should skip error checking/testing, as blank output is a clear indication of an error. This module will not throw runtime exceptions, so if an error occurs, it's not going to halt the site.
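To make the stripTags, maxLength and encodeEntities options above more concrete, here is a minimal standalone sketch of that kind of field cleanup. It is not the module's own code: the function name cleanFeedText() and the order of the three steps are assumptions made for illustration, and it assumes the mbstring extension is available.

```php
<?php
// Hypothetical helper, NOT taken from MarkupLoadRSS: it only approximates
// what the stripTags, maxLength and encodeEntities options are described
// as doing. The order of the steps is an assumption.
function cleanFeedText(string $text, int $maxLength = 2048, bool $stripTags = true, bool $encodeEntities = true): string
{
    if ($stripTags) {
        $text = strip_tags($text);          // drop any HTML markup
    }
    $text = trim($text);
    if (mb_strlen($text) > $maxLength) {
        $text = mb_substr($text, 0, $maxLength); // truncate overly long values
    }
    if ($encodeEntities) {
        $text = htmlspecialchars($text, ENT_QUOTES, 'UTF-8'); // make it safe to echo
    }
    return $text;
}

echo cleanFeedText('<b>Tom & Jerry</b> return', 11), "\n"; // prints "Tom &amp; Jerry"
```

Truncating before encoding keeps the cut from landing in the middle of an entity such as &amp;, which is one reason to order the steps this way.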
Soma Posted September 19, 2011

Great to have such a module, thanks Ryan! Though I can't install it:

Warning: mkdir() [function.mkdir]: No such file or directory in /Applications/XAMPP/xamppfiles/htdocs/pw2.ch/site/modules/MarkupLoadRSS.module on line 447
ryan Posted September 19, 2011

Oops, looks like I messed up something last minute. Just committed the fix. Thanks for letting me know.
Soma Posted September 19, 2011

Thanks, just installed it and did a test run, and it works great so far! 8)
almonk Posted September 19, 2011

Looks brilliant Ryan! Will try it out asap.
apeisa Posted September 20, 2011

Great work Ryan! Only thing I might add is support for multiple feeds. Though it might complicate this module too much?
formmailer Posted December 28, 2011

I want to display an RSS feed that contains items like the one below. It works well, except for the author field (dc:creator), which isn't parsed. Is there a way to parse this value as well?

    <item>
        <title>Taalkundigen Uppsala ontcijferen geheimschrift</title>
        <link>http://www.wereldwijzerzweden.net/2011/11/03/uppsala-geheimschrift-taalkundige-copiale/</link>
        <comments>http://www.wereldwijzerzweden.net/2011/11/03/uppsala-geheimschrift-taalkundige-copiale/#comments</comments>
        <pubDate>Thu, 03 Nov 2011 16:03:16 +0000</pubDate>
        <dc:creator>Marcel Burger</dc:creator>
        <category><![CDATA[Actueel]]></category>
        <category><![CDATA[berlijn]]></category>
        <category><![CDATA[Copiale]]></category>
        <category><![CDATA[geheimschrift]]></category>
        <category><![CDATA[universiteit]]></category>
        <category><![CDATA[uppsala]]></category>
        <guid isPermaLink="false">http://www.wereldwijzerzweden.net/?p=7227</guid>
        <description><![CDATA[<a href="http://www.wereldwijzerzweden.net/2011/11/03/uppsala-geheimschrift-taalkundige-copiale/"><img align="left" hspace="5" width="150" src="http://www.wereldwijzerzweden.net/images/copiale_280.jpg" class="alignleft wp-post-image tfe" alt="Deel uit vrijgegeven beeld van het Copialeschrift" title="copiale_280.jpg" /></a>3 november 2011 | Twee Zweedse taalkundigen en een Amerikaanse wetenschapper zijn erin geslaagd een 280 jaar oud geheimschrift uit Duitsland met voorheen onbegrijpelijke tekens te vertalen.]]></description>
        <wfw:commentRss>http://www.wereldwijzerzweden.net/2011/11/03/uppsala-geheimschrift-taalkundige-copiale/feed/</wfw:commentRss>
        <slash:comments>0</slash:comments>
    </item>

I output the $rss array with print_r() and it doesn't contain the dc:creator field (some others seem to be missing as well, but I don't need those).

/Jasper
Pete Posted December 28, 2011

Just a guess, but do the other fields it's not parsing contain a colon as well?
formmailer Posted December 28, 2011

No, for example comments didn't get parsed either.

/Jasper
ryan Posted December 28, 2011

If I recall correctly, SimpleXML doesn't work with properties that have colons in them. But you can fix that by replacing the colon properties with underscore properties in the XML data. So in this case, you'd want to add the str_replace() line below in the load() function:

    <?php
    public function load($url) {
        $this->items = new WireArray();
        $xmlData = $this->loadXmlData($url);
        $xmlData = str_replace('dc:creator', 'dc_creator', $xmlData);

Or you may be able to cover all the colon properties at once using a regexp like this:

    <?php
    $xmlData = preg_replace('{(</?[_a-z0-9]+):([_a-z0-9]+>)}', '$1_$2', $xmlData);

What that does is convert properties like <dc:creator> to <dc_creator> so that SimpleXML will understand them and you can access them in the module. Let me know if this works for you. I'm not in a place where I can update the source on this module today, but will plan to add something like the above soon.

I don't know why the <comments> property wouldn't be getting parsed, as that appears to just be a string (URL). I need to test and experiment with that one to find out why.
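As a quick check of the colon-to-underscore idea, the sketch below runs the regexp over a small inline sample (trimmed from the item quoted earlier) and reads the result with SimpleXML. The sample XML and variable names are illustrative only. It also shows an alternative that leaves the XML untouched: SimpleXML can return namespaced elements through children() when given the prefix. Note that the regexp is case-sensitive and expects ">" right after the tag name, so tags with uppercase letters or attributes would not be rewritten.

```php
<?php
// Sample data for illustration; only the dc:creator element matters here.
$xmlData = <<<XML
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <item>
      <title>Taalkundigen Uppsala ontcijferen geheimschrift</title>
      <dc:creator>Marcel Burger</dc:creator>
    </item>
  </channel>
</rss>
XML;

// Rewrite <prefix:name> and </prefix:name> to <prefix_name> and </prefix_name>.
$converted = preg_replace('{(</?[_a-z0-9]+):([_a-z0-9]+>)}', '$1_$2', $xmlData);

$rss = simplexml_load_string($converted);
$viaUnderscore = (string) $rss->channel->item[0]->dc_creator;
echo $viaUnderscore, "\n"; // prints "Marcel Burger"

// Alternative without touching the XML: ask for the children that live
// under the "dc" prefix (second argument true = treat 'dc' as a prefix).
$original = simplexml_load_string($xmlData);
$viaNamespace = (string) $original->channel->item[0]->children('dc', true)->creator;
echo $viaNamespace, "\n"; // prints "Marcel Burger"
```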
formmailer Posted December 28, 2011

Thanks Ryan, replacing the colons works, both with the str_replace and the regexp.

Quote: "I'm not in a place where I can update the source on this module today, but will plan to add something like the above soon."

I also submitted (via GitHub) a double encoding issue in this module (I am good at finding these). You might want to take a look at that one at the same time.

Quote: "I don't know why the <comments> property wouldn't be getting parsed, as that appears to just be a string (URL). I need to test and experiment with that one to find out why."

My fault, the comments property is parsed. One that didn't get parsed was the category, but that may be because it appears multiple times (a guess). The exact feed I am using is also in the GitHub issue, so you can test with it if you like.

/Jasper
ryan Posted December 29, 2011

Thanks for submitting the issue, I will fix. Also I'd like to find a way to get Comments (and any multi-item properties) working as well, should be easy. The feeds I'd originally tested with were pretty basic and didn't have these extended properties.
apeisa Posted May 24, 2012

Quote: "Great work Ryan! Only thing I might add is support for multiple feeds. Though it might complicate this module too much?"

I had a need for multiple feeds, and it turned out to be a pretty straightforward implementation. Only a few modifications to the load method:

    public function load($url) {

        $this->items = new WireArray();

        if(is_array($url)) {
            // multiple feeds: collect the <item> elements from every source
            $items = array();
            foreach($url as $feed) {
                $xmlData = $this->loadXmlData($feed);
                $xml = simplexml_load_string($xmlData);
                if(!$xml) continue; // skip a feed that failed to parse
                $items = array_merge($items, $xml->xpath('/rss//item'));
            }
            // channel info (title, description, ...) comes from the last feed loaded
            $rss = simplexml_load_string($xmlData);
        } else {
            $xmlData = $this->loadXmlData($url);
            $rss = simplexml_load_string($xmlData);
        }

        if(!$rss) {
            // $url may be an array here, so flatten it before using it in the message
            $source = is_array($url) ? implode(', ', $url) : $url;
            $msg = "Unable to load RSS feed at " . htmlentities($source) . ": \n";
            // libxml errors are objects; use their message text
            foreach(libxml_get_errors() as $error) $msg .= $error->message . " \n";
            $this->error($msg);
            return $this;
        }

        $this->channel['title'] = $this->cleanText((string) $rss->channel->title);
        $this->channel['description'] = $this->cleanText((string) $rss->channel->description);
        $this->channel['link'] = $this->cleanText((string) $rss->channel->link);
        $this->channel['created'] = strtotime((string) $rss->channel->pubDate);
        $this->channel['pubDate'] = date($this->options['dateFormat'], $this->channel['created']);

        $n = 0;

        // If we already have $items set, it means we are dealing with multiple sources.
        // Let's sort them, newest first.
        if(isset($items)) {
            usort($items, function ($x, $y) {
                return strtotime((string) $y->pubDate) - strtotime((string) $x->pubDate);
            });
        } else {
            $items = $rss->channel->item;
        }

        foreach($items as $item) {

            $a = new MarkupLoadRSSItem();

            foreach($item as $key => $value) {

                $value = (string) $value;

                if($key == 'pubDate') {
                    $value = strtotime($value);
                    $a->set('created', $value);
                    $value = date($this->options['dateFormat'], $value);
                } else {
                    $value = $this->cleanText($value);
                }

                $a->set($key, $value);
            }

            $this->items->add($a);

            if(++$n >= $this->options['limit']) break;
        }

        return $this;
    }

What it does: it checks whether $url is an array, then loads/caches all of those feeds and merges their RSS items into the $items array. Later on, $items is sorted by pubDate. So this is fully backwards compatible: just give it an array instead of a single URL if you need to parse multiple feeds.

If you guys can test that it works for you too, then maybe Ryan can put this in his version. I can do a pull request if you want (although it seems that the new and fancy GitHub for Windows messes up line endings).
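The merge-and-sort step at the core of that change can be tried outside ProcessWire. The sketch below is only an illustration under stated assumptions: two small inline XML strings stand in for whatever loadXmlData() would return, and the feed contents are made up.

```php
<?php
// Two tiny inline feeds stand in for the XML strings a loader would return.
$feeds = [
    '<rss version="2.0"><channel><title>A</title>
       <item><title>Older post</title><pubDate>Thu, 03 Nov 2011 16:03:16 +0000</pubDate></item>
     </channel></rss>',
    '<rss version="2.0"><channel><title>B</title>
       <item><title>Newest post</title><pubDate>Thu, 24 May 2012 09:00:00 +0000</pubDate></item>
       <item><title>Middle post</title><pubDate>Wed, 28 Dec 2011 12:00:00 +0000</pubDate></item>
     </channel></rss>',
];

// Collect the <item> elements of every feed into one flat array.
$items = [];
foreach ($feeds as $xmlData) {
    $xml = simplexml_load_string($xmlData);
    if ($xml === false) {
        continue; // skip feeds that fail to parse
    }
    $items = array_merge($items, $xml->xpath('/rss//item'));
}

// Sort newest first by comparing the parsed pubDate timestamps.
usort($items, function ($x, $y) {
    return strtotime((string) $y->pubDate) - strtotime((string) $x->pubDate);
});

$titles = array_map(function ($item) {
    return (string) $item->title;
}, $items);

echo implode(', ', $titles), "\n"; // prints "Newest post, Middle post, Older post"
```

Because the limit is applied after sorting, a limit of N yields the N newest items across all feeds rather than N per feed.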
apeisa Posted June 15, 2012

For some reason this module doesn't parse this RSS: http://www.ttl.fi/fi/tiedotteet/Sivut/Rss.aspx

The W3 validator says it is fine (although it offers a lot of recommendations): http://validator.w3.org/appc/check.cgi?url=http%3A%2F%2Fwww.ttl.fi%2Ffi%2Ftiedotteet%2FSivut%2FRss.aspx

Not sure what goes wrong and where..?
ryan Posted June 18, 2012

I'm not sure that the W3 validator is picking it up right either. It seems to be showing the whole thing as double entity encoded, so it isn't helping much. I also tried loading the feed in Safari, and it can't seem to read it correctly either. Firefox seems okay. Definitely something unusual going on with this feed, but I am not familiar enough with this particular format to know what's wrong.
apeisa Posted September 11, 2012

Yeah, there definitely was something strange going on with that feed. Now it seems to be working on my end too, so they must have fixed that.

Ryan: have you thought about adding that multisource functionality to this module? I am already using it in a couple of places, and it has been working great. Of course, if you think the implementation should be different, or an altogether different module, then let me know (or if you prefer a GitHub pull request).

What I was thinking is that it might be more "pw" to have ->add(source_url) etc. and then load, instead of passing all the URLs in an array, load($array_of_urls), like it is currently.
ryan Posted September 11, 2012

Sounds good! Let's go for a pull request if it's convenient for you. Or just post the .module file and I can go through the diff.
evanmcd Posted January 17, 2013

Hi Ryan,

Thanks for this module. We have been using it on our main site for a while now. Just wanted to let you know of an issue that I just discovered that others may run into, and see if there's a way to handle it.

I was trying to load a feed that for a while was not responding. The feed page wasn't throwing an error or even timing out, just loading for minutes on end. This ended up causing a timeout on our site (the feed was loading on the main page) and producing this error in the PW log file:

Error Exception: MySQL server has gone away (in /mnt/stor7-wc2-dfw1/526843/www.agencypja.com/web/content/wire/core/Database.php line 118)

For now, we've just disabled that feed, but we are using the module to load other feeds. Do you (or anyone else) know of a way to address this issue? I don't see a timeout option in the module, but could certainly look into adding one if that is determined to be the best option.

Thanks.
ryan Posted January 18, 2013

Could you PM me the RSS feed you are working with? I can do some testing here. I believe we can get it working by switching MarkupLoadRSS to use the new WireHttp class in PW 2.3.10+, but I need an example to test with.
evanmcd Posted January 22, 2013

Hi Ryan,

Unfortunately, the feed that was causing problems is now back up and running normally. I thought that I could recreate the issue by creating a PHP page on another server with a timeout set to at least 5 min, sleeping the script, and using that as the RSS feed, but that didn't work. I'll be sure to let you know if I ever come across it again.

Thanks.
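For the stalled-feed problem discussed above, one general PHP technique is to pass a stream context with an HTTP timeout when fetching the URL. This is only a sketch, assuming the feed is fetched through PHP's URL wrappers (which the allow_url_fopen requirement suggests); the function names feedTimeoutContext() and fetchFeed() are hypothetical and are not part of MarkupLoadRSS or of the WireHttp class mentioned above.

```php
<?php
// Hypothetical helpers, not part of MarkupLoadRSS.

// Build a stream context whose HTTP read timeout is $seconds.
function feedTimeoutContext(float $seconds)
{
    return stream_context_create([
        'http' => ['timeout' => $seconds],
    ]);
}

// Fetch a URL, giving up instead of waiting indefinitely on a stalled server.
// Returns the response body, or false on failure or timeout.
function fetchFeed(string $url, float $seconds = 5.0)
{
    return @file_get_contents($url, false, feedTimeoutContext($seconds));
}

// Inspect the context to confirm the option is set as intended.
$options = stream_context_get_options(feedTimeoutContext(5.0));
echo $options['http']['timeout'], "\n";
```

The 'timeout' option is a read timeout, so a server that keeps trickling data can still take longer than $seconds in total; treat the value as a guard against hangs rather than a hard limit on the request.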
suntrop Posted June 18, 2014

I am trying to load an RDF-based feed: http://www.bundestag.de/rss_feeds/gesundheit.rss

It seems the module can't parse that feed. I haven't worked with XML a lot. Maybe someone can give me a jumpstart on where to begin, so I can parse the feed above?
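As a possible starting point, here is a minimal sketch of reading an RDF-style feed with SimpleXML directly. It assumes the feed follows the RSS 1.0 layout, where the root element is rdf:RDF and the item elements are siblings of channel rather than children of it; that difference would explain why code looking for items inside the channel finds nothing. The inline sample document is made up, and the actual feed above has not been checked against it.

```php
<?php
// Made-up minimal RSS 1.0 (RDF) document. In this layout <item> elements
// are siblings of <channel>, not children of it.
$xmlData = <<<XML
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://example.com/feed">
    <title>Example channel</title>
    <link>http://example.com/</link>
  </channel>
  <item rdf:about="http://example.com/one">
    <title>First entry</title>
    <link>http://example.com/one</link>
  </item>
</rdf:RDF>
XML;

// RSS 1.0 elements live in this namespace.
$ns = 'http://purl.org/rss/1.0/';
$rdf = simplexml_load_string($xmlData);

$titles = [];
foreach ($rdf->children($ns)->item as $item) {
    $fields = $item->children($ns);   // the item's own RSS 1.0 child elements
    $titles[] = (string) $fields->title;
    echo (string) $fields->title, ' ', (string) $fields->link, "\n";
}
```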
alec Posted October 12, 2014

Is it possible to load images with this module?
kreativmonkey Posted September 15, 2015

I always get empty RSS feed output! On my page I call the RSS module through a URL segment, like blog/rss, and for the output I need the same page array that I use for the /blog page. But in the RSS feed I get no content!

    $blogposts = $pages->find("template=post, publish_date<$today, sort=-publish_date, limit=10");

    if($input->urlSegment1 === 'rss') {
        // retrieve the RSS module
        $rss = $modules->get("MarkupRSS");
        // configure the feed. see the actual module file for more optional config options.
        $rss->title = "Letzte Blogeinträge";
        $rss->render($blogposts);
        return;
    } else {
        $content = renderPosts($blogposts, true);
    }
bombemedia Posted January 14, 2016

I just downloaded the MarkupLoadRSS module from M.Cramer's GitHub repo. Here is the demo code I used for testing in my template:

    $rss = $modules->get("MarkupLoadRSS");
    $rss->load("http://rss.cbc.ca/lineup/canada.xml");

    foreach($rss as $item) {
        echo "<p>";
        echo "<a href='{$item->url}'>{$item->title}</a> ";
        echo $item->date . "<br /> ";
        echo $item->description;
        echo "</p>";
    }

All I get is this error:

Error: Call to a member function load() on a non-object (line 65 of C:\wamp\www\mysite\site\templates\home.php)

As if the module wants an object, like a $page or $config or something... I looked at the code and the function is load($url). Could it be conflicting with something? I'm running v2.7.2.
BitPoet Posted January 14, 2016

The error says that $rss is not an object, so the call to $modules->get() doesn't return a module instance. Did you install the module after downloading it?