Jump to content
aaronjpitts

XML feed importer to map to new pages and fields

Recommended Posts

Hi Guys,

I have just started to experiment with Processwire and really like what I see so far. I am hoping to try my first production site on it soon. I have long been a wordpress user, but want something new and more efficient. I am wondering if there is a module/feature similar to this for wordpress: http://www.wpallimport.com/ it's a full XML importer where you can import to create new pages (posts on wordpress) and map parts of each entry to certain fields of each page. Is this possible for Processwire? I use this with the advacned custom field plugin on wordpress to great effect. Each time the plugin is run, it will update the posts if they have already been created, and add new ones for new entries in the XML feed.

I've seen this module, but does it work with the latest version, and could it work with what I need: http://modules.processwire.com/modules/process-data-import/

If anyone can point me in the right direction I would appreciate it.

Thanks,

Aaron

Share this post


Link to post
Share on other sites

Hi @aaronjpitts and welcome!

A few tools that might help you:

CSV importer: http://modules.processwire.com/modules/import-pages-csv/

Table CSV Import/Export http://modules.processwire.com/modules/table-csv-import-export/

Wordpress Migrator https://github.com/NicoKnoll/MigratorWordpress

Migrator (json export and import) https://github.com/adrianbj/ProcessMigrator

Please let us know if you have any specific questions about any of these.

  • Like 1

Share this post


Link to post
Share on other sites

Hi Adrian,

Thanks for your suggestions, but I don't think those modules could work to import data from an XML feed, right? This is the feed I need to import from: https://services.boatwizard.com/bridge/events/1f163739-f2a4-45fe-8274-0302f30a2f7d/boats?status=on

It would be to import boats, each boat having several fields such as price, length, location etc

Many thanks

Share this post


Link to post
Share on other sites

Just to make this clear, you want a way to be able to continuously import from that source and not a one-time thing?

  • Like 2

Share this post


Link to post
Share on other sites

Yes, that's correct. Because the feed often updates with new entries (boats) being added, and old (sold) boats being removed.

Thanks

Good to know!

I think it might be worth you creating a converter that can be run via a cron job. 

Sorry I don't have time for a detailed example, so this might not be much use, but you can use one of the PHP xml to array (or maybe json) functions and then use the PW API to convert that into pages. It won't be terribly difficult, but not exactly trivial. There are several bits of code that might help you if you decide to tackle this - some from the Wordpress Migrator (xml to json) and Migrator (json to PW pages).

If you're willing to have a go, I am sure we can help you get through any roadblocks that come up.

  • Like 2

Share this post


Link to post
Share on other sites

Yipieh! Looks like a task for an importer script! :)

The XML has <DocumentID>123456</DocumentID> what seems to be the boat-article-ids. So, you can go with it manually via importer script, or first have a look for getting a PHP conversion lib that converts xml to csv and then maybe use the CSV module for importing.

But it is also very easy done manually with an importer script, (bootstrapping PW, running via cron or via PWs lazyCron). If you need any further explanation, please ask here.

EDIT: @Adrian beats me a few seconds :)

Edited by horst
  • Like 2

Share this post


Link to post
Share on other sites

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.


  • Recently Browsing   0 members

    No registered users viewing this page.

  • Similar Content

    • By Mike Rockett
      Docs & Download: rockettpw/markup-sitemap
      Modules Directory: MarkupSitemap
      Composer: rockett/sitemap
      MarkupSitemap is essentially an upgrade to MarkupSitemapXML by Pete. It adds multi-language support using the built-in LanguageSupportPageNames. Where multi-language pages are available, they are added to the sitemap by means of an alternate link in that page's <url>. Support for listing images in the sitemap on a page-by-page basis and using a sitemap stylesheet are also added.
      Example when using the built-in multi-language profile:
      <url> <loc>http://domain.local/about/</loc> <lastmod>2017-08-27T16:16:32+02:00</lastmod> <xhtml:link rel="alternate" hreflang="en" href="http://domain.local/en/about/"/> <xhtml:link rel="alternate" hreflang="de" href="http://domain.local/de/uber/"/> <xhtml:link rel="alternate" hreflang="fi" href="http://domain.local/fi/tietoja/"/> </url> It also uses a locally maintained fork of a sitemap package by Matthew Davies that assists in automating the process.
      The doesn't use the same sitemap_ignore field available in MarkupSitemapXML. Rather, it renders sitemap options fields in a Page's Settings tab. One of the fields is for excluding a Page from the sitemap, and another is for excluding its children. You can assign which templates get these config fields in the module's configuration (much like you would with MarkupSEO).
      Note that the two exclusion options are mutually exclusive at this point as there may be cases where you don't want to show a parent page, but only its children. Whilst unorthodox, I'm leaving the flexibility there. (The home page cannot be excluded from the sitemap, so the applicable exclusion fields won't be available there.)
      As of December 2017, you can also exclude templates from sitemap access altogether, whilst retaining their settings if previously configured.
      Sitemap also allows you to include images for each page at the template level, and you can disable image output at the page level.
      The module allows you to set the priority on a per-page basis (it's optional and will not be included if not set).
      Lastly, a stylesheet option has also been added. You can use the default one (enabled by default), or set your own.
      Note that if the module is uninstalled, any saved data on a per-page basis is removed. The same thing happens for a specific page when it is deleted after having been trashed.
          
    • By stanoliver
      My aim is to output a very basic xml document which should be styled with a few css-styles.
      <?xml version = "1.0"?> <contact-info> <name>Donal Duck</name> <company>Superducks</company> <phone>(011) 123-4567</phone> </contact-info> How do I implement it with processwire?
    • By louisstephens
      So I was tinkering around with the "select fields" field type and added it to a repeater. My thoughts were I could have a user select a field (textarea, text, etc etc) that I defined and give it a name (another field in the repeater) and create their own form on the page. To be honest, I am now a little lost with rendering the form and mailing the results as potentially the form will be unique and custom every time.  The only way I know to handle the output is by going about it this way:
      $forms = $page->form_select_fields; foreach($forms as $form) { if($form->name === "form_input") { //output input with custom name } elseif($form->name === "form_textarea") { //output input with custom name } } Is there a better way to go about rendering the elements from the repeater? As far as the custom sending goes, I am really at a loss since it would be pretty dynamic. Has anyone used this type of approach, and if so, how did you handle this without going insane?
    • By NorbertH
      I have trouble exporting fields via the buildin export. 
      For example when i export a single field  i get:
      { "bestellung_status_name": { "id": 194, "type": "FieldtypeText", "flags": 0, "name": "bestellung_status_name", "label": "Status Intern", "textformatters": [ "TextformatterEntities" ], "collapsed": 0, "minlength": 0, "maxlength": 100, "showCount": 0, "size": 0, "pattern": "[a-z\\A-Z\\(\\)]+", "showIf": "", "themeInputSize": "", "themeInputWidth": "", "themeOffset": "", "themeBorder": "", "themeColor": "", "themeBlank": "", "columnWidth": 100, "required": "", "requiredAttr": "", "requiredIf": "", "stripTags": "", "placeholder": "" } } exporting a secon single field i get :
      { "bestellung_status_name_ext": { "id": 195, "type": "FieldtypeText", "flags": 0, "name": "bestellung_status_name_ext", "label": "Status Extern", "textformatters": [ "TextformatterEntities" ], "collapsed": 0, "minlength": 0, "maxlength": 100, "showCount": 0, "size": 0, "pattern": "[a-z\\A-Z\\(\\)]+", "showIf": "", "themeInputSize": "", "themeInputWidth": "", "themeOffset": "", "themeBorder": "", "themeColor": "", "themeBlank": "", "columnWidth": 100, "required": "", "requiredAttr": "", "requiredIf": "", "stripTags": "", "placeholder": "" } } So far everything works fine .
      When i try to export both fields together  i get only an error message :
      Call to a member function getModuleInfo() on null File: .../wire/modules/Fieldtype/FieldtypeText.module:171 I added " bd($textformatter);" on line 170 to see whats wrong. so have a look at the screenshot i apended to this post.
       
      Its perfectly possible that one textformater module got removed by accident while experimenting whith some textformaters but my question is how to fix this maybe somewhere in the DB and possibly what went wrong?
      ProcessWire 3.0.120 © 2019
      Apache/2.4.25 (FreeBSD) OpenSSL/1.0.2k mod_fcgid/2.3.9
      PHP 7.1.2
           

       
      Edit: After adding
      if ($textformatter ===NULL) continue; I can export my fields , as there arent any Textformater missing in the fields , i get a perfect result. but still there is one textformater whith a NULL value.  
       
       
    • By MoritzLost
      This will be more of a quick tip, and maybe obvious to many of you, but it's a technique I found very useful when building display options. By display options I mean fields that control how parts of the page are displayed on the frontend, for example background colors, sizing, spacing and alignment of certain elements. I'll also touch on how to make those options intuitive and comfortable to use for clients.
      It basically involves setting up option values that can be used directly in CSS classes or as HTML elements and mapping those to CSS styling (which can be quickly generated in a couple of lines using a pre-processor such as SASS). Another important aspect is to keep the option values seperate from their corresponding labels; the former can be technical, the latter should be semantically meaningful. The field type that lends itself to this this seperation of concerns is the Selectable Options field, the following examples mostly use this field type. Note that this module is part of the ProcessWire core, but not installed by default.
      The following examples all come from real projects I built (though some are slightly modified to better demonstrate the principle).
      #1: Headline levels & semantics
      For a project that had many pages with long texts, I used a Repeater field to represent sections of text. Each section has a headline. Those sections may have a hierarchical order, so I used a Selectable Option field to allow setting the headline level for each section (you can guess where this is going). The definition of the options looks something like this (those are all in the format value|label, with each line representing one option, see the blogpost above for details):
      h2|Section headline h3|Sub-section headline Of course, the PHP code that generates the corresponding HTML can use those values :
      // "sections" is the repeater field foreach ($page->sections as $section) { // create an h2 / h3 tag depending on the selected option (called headline_level here) echo "<{$section->headline_level->value}>{$section->headline}</{$section->headline_level->value}>"; echo $section->body; } That's a pretty obvious example, but there are two important takeaways:
      I only used two options. Just because there are six levels of headlines in HTML, doesn't mean those are all relevant to the client. The less options there are, the easier it is to understand them, so only the options that are relevant should be provided. In this case, the client had provided detailed, structured documents containing his articles, so I could determine how many levels of hierarchy were actually needed. I also started at h2, since there should be only one h1 per page, so that became it's own field separate from the repeater. The two options have a label that is semantically relevant to the client. It's much easier for a client who doesn't know anything about HTML to understand the options "Section headline" and "Sub-section headline" than "h2" and "h3". Sure, it can be cleared up in the field description, but this way it goes from something that's quickly explained to something that needs no explanation at all. #2: Image width and SASS
      In the same project, there was also an image section; in our layout, some images spanned the entire width of the text body, others only half of it. So again, I created an options field:
      50|Half width 100|Full width In this case, I expected the client to request different sizes at some point, so I wanted it to be extensible. Of course, the values could be used to generate inline styles, but that's not a very clean solution (since inline styled break the cascade, and it's not semantic as HTML should be). Instead, I used it to create a class (admittedly, this isn't strictly semantic as well):
      <img src="..." class="w-<?= $section->image_width->value ?>"> With pure CSS, the amount of code needed to write out those class definitions will increase linearly with the number of options. In SASS however, you only need a couple of lines:
      @each $width in (50, 100) { .w-#{$width}{ max-width: percentage($width/100); } } This way, if you ever need to add other options like 25% or 75%, you only need to add those numbers to the list in parenthesis and you're done. You can even put the definition of the list in a variable that's defined in a central variables.scss file. Something like this also exists in Bootstrap 4 as a utility, by the way.
      It also becomes easier to modifiy those all at once. For example, if you decide all images should be full-width on mobile, you only need to add that once, no need to throw around !important's or modify multiple CSS definitions (this is also where the inline styles approach would break down) :
      # _variables.scss $image-widths: (25, 50, 75, 100); $breakpoint-mobile: 576px; # _images.scss @import "variables"; @each $width in $image-widths { .w-#{$width}{ max-width: percentage($width/100); @media (max-width: $breakpoint-mobile) { max-width: 100%; } } } One important gotcha: It might be tempting to just use an integer field with allowed values between 10 - 100. In fact, the amount of SASS code would be identical with a @for-directive to loop throuh the numbers. But that's exactly what makes point-and-click page builders so terrible for clients: too many options. No client wants to manually set numerical values for size, position and margins for each and every element (looking at you, Visual Composer). In fact, having too many options makes it much harder to create a consistent layout. So in those cases, less is more.
      #3: Multiple options in one field
      Another example for repeatable page sections, this time for a two-column layout. The design included multiple variants regarding column-span and alignment. Using a 12-column grid, we needed a 6-6 split, a centered 5-5 split, a left-aligned 6-4 split and a right-aligned 4-6 split. I didn't want to litter the repeater items with options, so I decided to put both settings in one field (called something like Column layout) :
      center_6_6|6 / 6 (Centered) center_5_5|5 / 5 (Centered) left_6_4|6 / 4 (Left-aligned) right_4_6|4 / 6 (Right-aligned) As long as the value format is consistent, the individual options can be quickly extracted and applied in PHP:
      [$alignment, $width['left'], $width['right']] = explode('_', $section->column_layout->value); echo '<section class="row justify-content-' . $alignment . '">'; foreach (['left', 'right'] as $side) { echo '<div class="col-lg-' . $width[$side] . '">'; echo $section->get("body_{$side}"); echo '</div>'; } echo '</section>'; If you don't recognize the syntax in the first line, it's symmetric array destructuring, introduced in PHP 7.1. For older versions you can use list() instead. This example uses Bootstrap 4 grid classes and flexbox utility classes for alignment. The corresponding CSS can be quickly generated in SASS as well, check the Bootstrap source code for a pointer.
      Again, I'm limiting the options to what is actually needed, while keeping it extensible. With this format, I can easily add other column layouts without having to touch the code at all.
      #4: Sorting page elements
      A final example. In this case I was working on a template for reference projects that had three main content sections in the frontend: A project description, an image gallery and embedded videos (each using their own set of fields). The client requested an option to change the order in which those sections appeared on the page. Had I known this earlier, I maybe would have gone for a Repeater Matrix approach once again, but that would have required restructuring all the fields (and the corresponding code), so instead I used a Selectable Option field (labelled "Display order"). My approach is similar to the one from the last example:
      body_gallery_embeds|Description - Gallery - Videos body_embeds_gallery|Description - Videos - Gallery gallery_body_embeds|Gallery - Description - Videos gallery_embeds_body|Gallery - Videos - Description embeds_body_gallery|Videos - Description - Gallery embeds_gallery_body|Videos - Gallery - Description Since there are six possibilities to sort three items, this is the expected number of options. So I decided to include them all, even though some are probably never going to be used. I also tried to use a predictable order for the options (i.e. the options come in pairs, depending on what element is first). And here is the code used on the frontend:
      // render the template files for each section and store the result in an associative array $contents = [ 'body' => wireRenderFile('partials/_section-body.php', $page), 'gallery' => wireRenderFile('partials/_section-gallery.php', $page), 'embeds' => wireRenderFile('partials/_section-embeds.php', $page), ]; // e.g. 'gallery_body_embeds' => ['gallery', 'body', 'embeds']; $order = explode('_', $page->display_order->value); // echo the contents in the order defined by the option value foreach ($order as $item) { echo $contents[$item]; } You can see how it will be easy to add an additional section and integrate it into the existing solution. Though a fourth item would result in 4! = 24 possibilities to sort them, so at that point I'd talk to my client about which layouts they actually need 🙂
      Conclusion
      I always try to keep my code and the interfaces I create with ProcessWire extensible and intuitive. Those are a couple of solutions I came up with for projects at work. They are certainly not the only approach, and there is nothing super special about those examples, but I found that putting a little more effort into defining options with meaningful labels and using option values that I can use directly in my templates makes the result less verbose and more maintainable. Some or most of this tutorial may be immediately obvious to you, but if you made it this far, hopefully you got something out of it 🙂
      Feel free to share your own methods to create display options, or how you would've approached those problems differently. Thanks for reading!
×
×
  • Create New...