Import pages/files from XML files


cb2004

I just got into ProcessWire and I am loving it so far. Now that I have a basic understanding of it, I am ready to start doing some funkier stuff.

At present I have a site where the client uploads XML files and image files to a directory they have FTP control over. Within their current CMS there is a link in admin to a PHP file which processes the XML files and puts the data into a custom table outside of the CMS tables, so the data is not visible in admin and certain things cannot be controlled properly.

I have dived into the API and I am pretty sure this should be possible, and maybe even pretty easy, but where should I start? By creating a parent with a template, and then a template for the children with all the fields that I need?


You left out how you actually want to handle the data now.

If you want them to become simple pages in ProcessWire, then yeah, you'd build the templates to hold your data and then import the data as pages. For the importing part, take a look at some of these: http://modules.processwire.com/categories/import-export/

If you need something more basic to start with, here you can find how to create a page from the API: https://processwire.com/talk/topic/352-creating-pages-via-api/


Yes, my thinking was to make them pages, purely from an SEO point of view so that we can edit the titles and descriptions. The rest of the data may not even need to be changed, but that can be locked down when setting up the fields (I have seen these options before I think).

Now regarding running this thing, I have looked at the ProcessHello module and I can run a function from that after it sets up a page. Is that the best way? Seems too easy so far.


Process modules are the type of module that integrates as UI into the admin backend, so if you can make it do what you want then it's perfectly fine. Sometimes people split the processing out into a separate module so it's easier to use from the UI (the Process module) as well as from the API, but it's not necessary to do that and may be overkill depending on the feature set you need.


Well, I have got it checking for the XML file in a subfolder of my templates folder. If it finds it, you can execute the module, and basically all it is going to do is read the XML file and use the API to add pages. I am just wondering how to check for changes between the XML file and what is in the database. Hopefully, once I have stopped typing this, I will check and the XML file will have a modified date.
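On the modified date idea, here is a minimal sketch, assuming the time of the last successful import is saved somewhere between runs. The `$lastImport` value and the `xmlChangedSince()` helper name are hypothetical; `filemtime()` is standard PHP and returns the file's last modification time as a Unix timestamp. Note that this only tells you the file as a whole has changed, not which properties changed.

```php
<?php
/**
 * Returns true when the file at $path was modified after $lastImport.
 * $lastImport is a Unix timestamp saved by the previous import run
 * (where it gets stored is an assumption and left to the caller).
 */
function xmlChangedSince(string $path, int $lastImport): bool
{
    if (!is_file($path)) {
        return false; // nothing to import
    }
    clearstatcache(true, $path); // avoid reading a stale cached mtime
    $modified = filemtime($path);
    return $modified !== false && $modified > $lastImport;
}
```

If it returns false, the import can simply be skipped for that run.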


This really depends on your use case. If both places should stay in sync, then it may be a bit of work. If you only want to import changes from the XML file and the imported pages are not changed in the admin backend, then it's quite an easy job.



OK, I only needed a short amount of time to get to where I am, so I am really seeing the speed of ProcessWire development.

The code below is adding pages just fine. My next step is to check, when this is run again, whether the page is already in the database: if it is, update it; if it isn't, add it. As you will see, the XML file contains a Refnumber, which I store in ProcessWire so we can match things up.

public function ___executeImport() {

	Wire::setFuel('processHeadline', 'Complete');

	$out = '<h2>Properties import</h2>';

	// $config->paths->templates already ends with a slash
	$file = wire('config')->paths->templates . 'uploads/properties.xml';

	if(file_exists($file)) {

		$this->message("The XML file has been found");
		$xml = simplexml_load_file($file);

		if($xml === false) {
			$this->error("The XML file could not be parsed");
			return $out;
		}

		foreach($xml->Properties->Property as $property) {
			$page = new Page();
			$page->template = 'basic-page';
			$page->parent = $this->pages->get(1017);
			$page->name = $property->Number . $property->Street . $property->Address3;
			$page->title = $property->Number . ' ' . $property->Street . ', ' . $property->Address3;
			$page->property_ref = (int) $property->Refnumber;
			$page->save();
		}

	} else {
		$this->error("The XML file cannot be found");
	}

	// a Process module's execute method returns its output
	return $out;
}
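For anyone following along without the feed, here is a standalone sketch of how the values come out of SimpleXML. The XML sample is made up to match the element names used in the import code above (`Properties`, `Property`, `Refnumber`, `Number`, `Street`, `Address3`); the root element name and the real feed's layout are assumptions. `readProperties()` is a hypothetical helper. Each child is a `SimpleXMLElement`, so it is cast to int or string before use.

```php
<?php
// Hypothetical sample, shaped after the element names used in the import code.
$sample = <<<XML
<Feed>
  <Properties>
    <Property>
      <Refnumber>101</Refnumber>
      <Number>12</Number>
      <Street>High Street</Street>
      <Address3>Leeds</Address3>
    </Property>
  </Properties>
</Feed>
XML;

/**
 * Parses the feed and returns one plain array per property.
 * Returns an empty array when the XML cannot be parsed.
 */
function readProperties(string $xmlString): array
{
    libxml_use_internal_errors(true); // collect parse errors instead of emitting warnings
    $xml = simplexml_load_string($xmlString);
    if ($xml === false) {
        return [];
    }
    $rows = [];
    foreach ($xml->Properties->Property as $property) {
        $rows[] = [
            'ref'   => (int) $property->Refnumber,
            // string concatenation casts each element to its text content
            'title' => $property->Number . ' ' . $property->Street . ', ' . $property->Address3,
        ];
    }
    return $rows;
}

$rows = readProperties($sample);
```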

I am guessing I need to make a start by getting the pages:

// pass the selector to children() so the filtering happens in the database query
$properties = wire('pages')->get(1017)->children("template=basic-page");

And then checking the property_ref field.

Any guidance on the best way of doing this would be great.


This should be enough:

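$ref = (int) $property->Refnumber;
$existing = wire('pages')->get("template=basic-page, property_ref=$ref");

if (!$existing->id) {
	// no page has this reference yet: you're good to go and can create it
} else {
	// a page already exists: update its fields and save it
}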
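Put together, the create-or-update step could look something like the sketch below. It has not been run against a real install. To keep it runnable on its own, the function takes `$pages` (expected to behave like ProcessWire's `$pages`, where `get()` returns an object whose `id` is 0 when nothing matches) and `$makePage`, a hypothetical factory such as `function() { return new Page(); }`. The `importProperty()` name and its return values are made up for illustration; the template, the parent ID and the `property_ref` field come from the code earlier in the thread.

```php
<?php
/**
 * Creates the page for one <Property> element, or updates it when a page
 * with the same property_ref already exists.
 *
 * @param object   $pages    ProcessWire's $pages (or anything with a compatible get())
 * @param callable $makePage hypothetical factory returning a new, unsaved page
 * @return string 'created' or 'updated'
 */
function importProperty($pages, callable $makePage, SimpleXMLElement $property): string
{
    $ref = (int) $property->Refnumber;
    $page = $pages->get("template=basic-page, property_ref=$ref");
    $isNew = !$page->id; // id is 0 when nothing matched

    if ($isNew) {
        $page = $makePage();
        $page->template = 'basic-page';
        $page->parent = $pages->get(1017);
        $page->property_ref = $ref;
    }

    // Set on both paths so changes in the XML also reach existing pages.
    $page->title = $property->Number . ' ' . $property->Street . ', ' . $property->Address3;
    $page->save();

    return $isNew ? 'created' : 'updated';
}
```

Inside the module's foreach this would be called as `importProperty($this->pages, function() { return new Page(); }, $property);`. One thing to decide: as written, every import overwrites the title, so if titles are going to be edited in admin, that line probably belongs inside the `if ($isNew)` branch instead. If this runs from a template file rather than the admin, you may also need `$page->of(false)` before setting values.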

As I understand it, you are creating one page per property. In that case yes, do the check straight in the foreach. Unlike find(), get() stops immediately after finding the first page that matches the selector and keeps only that one object in memory, so it's very light and surprisingly fast.

Edit: as for querying the database every time, I don't see a better solution. The alternative would be to loop through all the properties in the XML to collect the references, query the database to check which ones are not there yet, collect those, and loop through the XML again using only those... I'm not sure this would be more efficient.
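For what it's worth, the "collect the references first" alternative could be sketched roughly like this. Getting the existing references out of ProcessWire is left out: `$existingRefs` is assumed to be a plain array of the `property_ref` values already stored, and `splitRefs()` is a hypothetical helper name. The comparison itself is plain PHP.

```php
<?php
/**
 * Splits the references found in the XML into those that still need a page
 * and those that already have one.
 *
 * @param int[] $xmlRefs      Refnumber values read from the XML
 * @param int[] $existingRefs property_ref values already stored (assumed input)
 * @return array{new: int[], existing: int[]}
 */
function splitRefs(array $xmlRefs, array $existingRefs): array
{
    $xmlRefs = array_unique($xmlRefs); // ignore duplicate entries in the feed
    return [
        'new'      => array_values(array_diff($xmlRefs, $existingRefs)),
        'existing' => array_values(array_intersect($xmlRefs, $existingRefs)),
    ];
}

$result = splitRefs([101, 102, 103], [102, 999]);
// $result['new'] is [101, 103]; $result['existing'] is [102]
```

Whether this beats one get() per property probably depends on how many properties the file holds, so it is worth measuring before adding the extra pass.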
