Parsing XML to Processwire


formmailer
Hi,

While converting a site to PW I ran into two (slightly similar) issues that I need to solve before I can launch the PW version.


1. Inserting a value from an XML field into the body

On one of the pages I need to display the current exchange rate of the Swedish crown inside the body field.

The current rate should be retrieved from an XML feed of the European Central Bank:

http://www.ecb.int/stats/eurofxref/eurofxref-daily.xml

The result should be something like:

  this is the body text

  .....

  .....

  The current currency rate: 100 Swedish Crowns is 11,59 Euro

  .....

  .....

  The body text ends here.

What would be the best way to accomplish this?


2. Publishing an RSS feed (the page should act as a reader)

I need to publish the RSS feed of another site (WordPress) on my page, something like this:

  Latest news by www.otherwebsite.com

  Article 1

  Summary of the article, with an image if the feed offers one.

  Read more...

  Article 2

  Summary of the article, with an image if the feed offers one.

  Read more...

  Article 3

  Summary of the article, with an image if the feed offers one.

  Read more...

This feed is also an XML feed of course.

Any suggestions on how to accomplish this?


I used to use some external PHP scripts to do both of the above, but that doesn't seem right anymore. In particular, the RSS feed to be displayed should go into the PW database, probably with each article as its own page.

Maybe the currency rate could also be added as a page....

I hope that you have some ideas on this....

/Jasper


Your question number 2 is well covered by the RSS Feed Loader module by Ryan: http://processwire.com/talk/index.php/topic,503.0.html

Your first question... I think there are a million ways to go, but the first thing that came to mind is something like this: use a simple "tag" in your body text. Let's say it looks like this:

{CUR=SEK}

Then you would parse your body field before outputting it, looking for {CUR=???} tags. I think the best way would be a regexp that looks for strings which start with {, end with } and have 7 characters in between (you could also check for the = character, just in case). I'm not very good with regexps, so I'll leave that part to you :) It shouldn't be hard though.
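A minimal sketch of that regexp step, assuming PHP 7 or later and that currency codes are always three uppercase letters. The function name replaceCurrencyTags and the stand-in lookup are made up for illustration; they are not part of ProcessWire.

```php
<?php
// Hypothetical helper (not a ProcessWire API): replaces every {CUR=XXX}
// tag in $text with whatever $lookup returns for the three-letter code.
function replaceCurrencyTags(string $text, callable $lookup): string {
    // \{CUR=  literal opening, ([A-Z]{3}) captures the code, \} literal close
    return preg_replace_callback(
        '/\{CUR=([A-Z]{3})\}/',
        function (array $m) use ($lookup) {
            return (string) $lookup($m[1]);
        },
        $text
    );
}

// Usage with a stand-in lookup; a real one would read the ECB feed.
$body = "Rate today: {CUR=SEK}. Also {CUR=USD}.";
$result = replaceCurrencyTags($body, function (string $code) {
    return "[rate for $code]";
});
echo $result, "\n";
```

Using a callback means several different tags in one body field are handled in a single pass, and text that does not match the pattern exactly is left untouched.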

Then you would have a PHP script that does the actual XML parsing. This can actually be a PW template and page if you want. It could be a hidden page like /currencies/, with a template something like this:

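<?php
// Written in browser, no error checking etc...
// Should output rate for desired currency

// Keep only letters so the value is safe to put inside the XPath expression
$cur = preg_replace('/[^A-Z]/', '', strtoupper((string) $input->get->cur));
$url = "http://www.ecb.int/stats/eurofxref/eurofxref-daily.xml";

$xml = simplexml_load_string(file_get_contents($url));

// The Cube elements are in the ECB default namespace, so register a prefix for it
$xml->registerXPathNamespace("ecb", "http://www.ecb.int/vocabulary/2002-08-01/eurofxref");
// xpath() returns an array of matches; take the first one
$nodes = $xml->xpath("//ecb:Cube[@currency='$cur']");
$value = $nodes ? (string) $nodes[0]['rate'] : '';

echo $value;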

And back to your actual template. When you parse the body and find {CUR=???} tags, you would do something like this in your template:

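<?php
$tag = "{CUR=SEK}"; // This (or these if multiple required?) comes from regexp
$cur = "SEK"; // this actually comes from {CUR=SEK}
// file_get_contents() needs a full URL here; a bare "/currencies" would be read as a local file path
$rate = (float) file_get_contents($pages->get("/currencies/")->httpUrl . "?cur=$cur");
// The feed gives SEK per 1 EUR, so 100 SEK is 100 / rate in EUR
$euro = $rate > 0 ? round(100 / $rate, 2) : "?";
$output = "The current currency rate: 100 Swedish Crowns is $euro Euro";
$page->body = str_replace($tag, $output, $page->body);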

Here I would use the MarkupCache module so that it doesn't hit the other server on every page load, only once a day or once an hour or so.

Sorry if this came out a little messy, it was written in a rush, but hopefully it will help you move forward. Also, I am not sure this is the most elegant way to solve this (it's just the first thing that came to mind), so others please dive in with other solutions too.


Thanks for your reply Antti!

> Your question number 2 is well covered by the RSS Feed Loader module by Ryan: http://processwire.com/talk/index.php/topic,503.0.html

Wow, I totally missed this module. It's exactly what I need.

Your reply regarding my first question is very helpful. I'll start playing with it.

/Jasper


I got it working!  :)

> I would actually build a simple module for your first question:
>
> $modules->MarkupCurrencies->getCurrency("SEK");

Good idea! I just did this.

The core function:

<?php 
public function getCurrency($cur) {
	$xml = simplexml_load_file("http://www.ecb.int/stats/eurofxref/eurofxref-daily.xml");
	$xml->registerXPathNamespace("ecb", "http://www.ecb.int/vocabulary/2002-08-01/eurofxref");
	$array = $xml->xpath("//ecb:Cube[@currency='".$cur."']/@rate");
	$rate = (string) $array[0]['rate'];
	$time_array = $xml->xpath("//ecb:Cube/@time");
	$time = (string) $time_array[0]['time'];
	return array ($time, $rate);
}

Below is the template part:

<?php
$tag = "{CUR=SEK}";
$cache = $modules->get("MarkupCache");
// Rebuild the output only when the cached copy is older than 24 hours (86400 seconds)
if(!$output = $cache->get("output", 86400)) {
	$array = $modules->MarkupCurrencies->getCurrency('SEK');
	$rate = round(100 / $array[1], 2); // the feed gives SEK per 1 EUR
	$time = strtotime($array[0]);
	// Dutch: "100 Swedish crowns (SEK) is approximately $rate Euro (EUR)", "(last updated on: ...)"
	$output = "100 Zweedse kroon (SEK) is ongeveer $rate Euro (EUR)<br />
<span class='small-text italic'>(laatst bijgewerkt op: ". date('d-m-Y', $time) .")</span>";
	$cache->save($output);
}
$body = str_replace($tag, $output, $body);

The cache part makes sure that the parsing only happens once every 24 hours to avoid high server load and reduce loading times. But the str_replace part is running on all pages using this general template. Would it be better to have a specific template for this page?

/Jasper


Great stuff Jasper.

If you have some other way to tell when that str_replace is needed, you could just wrap it in an if based on that. That is probably a little easier to maintain than a separate template just for this.

But I don't think that simple str_replace is causing any problems for your site performance. Not sure though.
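A minimal sketch of such an if-check, assuming the presence of the tag itself is the signal. The helper name fillTag and the counter are hypothetical and only there to show that the lookup is skipped when the tag is absent.

```php
<?php
// Hypothetical helper, not a ProcessWire API: only calls $buildOutput
// (the expensive part: cache or feed lookup) when the tag is really in the text.
function fillTag(string $body, string $tag, callable $buildOutput): string {
    // strpos() can return 0 for a match at the very start, so compare with false strictly
    if (strpos($body, $tag) === false) {
        return $body; // nothing to replace, skip the lookup entirely
    }
    return str_replace($tag, $buildOutput(), $body);
}

// Stand-in for the real output builder; counts how often it is called.
$calls = 0;
$build = function () use (&$calls) {
    $calls++;
    return "[currency line]";
};

$withTag = fillTag("Intro {CUR=SEK} outro", "{CUR=SEK}", $build);
$withoutTag = fillTag("Just text", "{CUR=SEK}", $build);
echo $withTag, "\n", $withoutTag, "\n", "lookups: $calls\n";
```

The gain here is less about str_replace itself and more about not touching the cache or the remote feed on pages that never contain the tag.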


> But I don't think that simple str_replace is causing any problems for your site performance. Not sure though.

I agree, this is not something you need to worry about. Some unoptimized preg_replace calls might be something to worry about, but not str_replace on any reasonable length of content.


  • 2 months later...

I noticed a problem with this module: if the site hosting the XML file is offline when it's time to update the cache, my whole site crashes with this error: Call to a member function registerXPathNamespace() on a non-object (line 66 of D:\xampp\htdocs\zwedenweb\site\modules\MarkupCurrencies.module).

And of course this happened the same day I launched the site. :'(

Somehow I need to build in some kind of error handling, but I don't know how. It would be okay if the template part isn't executed when the ecb.int site is unavailable.

Any suggestions on how to do this?

Thanks in advance!

/Jasper


Hello formmailer,

As exchange rates usually vary quite slowly, how about just caching the last value you successfully get each time and falling back to that if a read fails? You could also have your module email you if reading fails a number of times.

Edited to add: This might not be acceptable for some applications that may require a certain amount of 'freshness' to the data or to data from sites where the TOS forbid caching.
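A minimal sketch of the "fall back to the last good value" idea, assuming the value is kept in a plain file. The helper name rateWithFallback, the temp-file location and the sample rate '8.63' are all made up for illustration; in a module the fetcher would be the real feed request.

```php
<?php
// Hypothetical helper: $fetch returns a rate string, or false on failure;
// $file is where the last successful value is remembered.
function rateWithFallback(callable $fetch, string $file) {
    $fresh = $fetch();
    if ($fresh !== false && $fresh !== '') {
        file_put_contents($file, $fresh, LOCK_EX); // remember the last good value
        return $fresh;
    }
    // Fetch failed: fall back to the stored value, if there is one
    return is_file($file) ? file_get_contents($file) : false;
}

// Usage with stand-in fetchers instead of a real HTTP request
$store = sys_get_temp_dir() . '/last_rate_' . uniqid() . '.txt';
$first  = rateWithFallback(function () { return '8.63'; }, $store); // feed is up
$second = rateWithFallback(function () { return false; }, $store);  // feed is down
var_dump($first, $second);
unlink($store); // clean up the demo file
```

If no value has ever been stored, the helper returns false, so the calling template still needs a branch for "nothing to show".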

Edited by netcarver

Formmailer, you should be able to do something like this (it needs the allow_url_fopen PHP server setting to be on, but I guess the RSS module also uses it, so it will work). Simply use this in the template where your MarkupCache or XML code is, to check whether the URL is available. If it is not, the call returns false; the @ is just there to suppress any error it would throw.

if($file = @file_get_contents("http://domain.com/the/path/to/xml")) {  // or @simplexml_load_file
	echo $file;
} else {
	// do something else
	echo "not available";
}

Thanks Soma and Netcarver.

I think everything is fine now.

I used Soma's solution and added a 5 second timeout.

public function getCurrency($cur) {
	// Create the stream context
	$context = stream_context_create(array(
		'http' => array('timeout' => 5) // Timeout in seconds
	));

	// The @ suppresses the warning when the ECB site is unreachable
	$file = @file_get_contents("http://www.ecb.int/stats/eurofxref/eurofxref-daily.xml", false, $context);
	$xml = $file ? simplexml_load_string($file) : false;
	if (!$xml) return false; // feed unavailable or not valid XML

	$xml->registerXPathNamespace("ecb", "http://www.ecb.int/vocabulary/2002-08-01/eurofxref");
	$array = $xml->xpath("//ecb:Cube[@currency='".$cur."']/@rate");
	$time_array = $xml->xpath("//ecb:Cube/@time");
	if (!$array || !$time_array) return false; // currency or date not found in the feed
	$rate = (string) $array[0]['rate'];
	$time = (string) $time_array[0]['time'];
	return array($time, $rate);
}

And in the template I use the following:

<?php
$tag = "{CUR=SEK}";
$cache = $modules->get("MarkupCache");
if(!$output = $cache->get("output", 86400)) {
	$array = $modules->MarkupCurrencies->getCurrency('SEK');
	if (!$array) {
		// Feed unavailable: show nothing (note that this blank is cached as well)
		$output = " ";
	} else {
		$rate = round(100 / $array[1], 2);
		$time = strtotime($array[0]);
		$output = "100 Zweedse kroon (SEK) is ongeveer $rate Euro (EUR)<br />
<span class='small-text italic'>(laatst bijgewerkt op: ". date('d-m-Y', $time) .")</span>";
	}
	$cache->save($output);
}
$body = str_replace($tag, $output, $body);
?>

I think this will solve the problem. But just to be on the safe side, I added an additional if-statement, so that this code only runs on the few pages that actually need it.

/Jasper
