Jump to content

How to import HTML with images?

Recommended Posts

I am after a bit of guidance on importing content to ProcessWire. I am using a program called Clarify to generate work instructions. I would like to use ProcessWire to put together a knowledgebase of work instructions.

I have Clarify outputting an HTML file and a folder of images for each work instruction which I would like to become a page in PW.

An example of the HTML output is attached.

How can I import these pages and the folder of images and keep the image links intact?

Many Thanks



Share this post

Link to post
Share on other sites

For the topic how to import / batchimport content into PW you will find many posts with examples in the forums, - so I will focus on your special issue with your images that are linked to "./images/...." in your htmlContent.

Assumed, you use an importer script, that reads the following from a source:

  • title / name of an entry
  • html content
  • images basenames

Assumed, that the importer script can access the imagefiles / knows where they are kept for the import.
And assumed that you have created a template with fields for those pages you want to create during the import.
Then you will come to a point, where you programatically create a new page:

$p = new Page($myTemplateObject);
$p->title = $theEntryTitle;
$p->parent = $theParentContainerPage;

Now you can add / import your images to an imagefield:

foreach($imagebasenameArray as $basename) {
    $p->imagefield->add($directorypathToImages . $basename);

Now all images of that entry resides in /site/assets/files/{PAGE_ID}/. And you have to modify your html-content to match this:

$p->body = str_replace("./images/", $config->urls->files . $p->id . "/", $htmlContent);

This way you will have the most possible flexibility with your pages, I think.


Please note: written in the browser, - may have bugs, - is meant as pseudo code. :)

  • Like 4

Share this post

Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

  • Recently Browsing   0 members

    No registered users viewing this page.

  • Similar Content

    • By Rodd
      Hi everyone!
      I have a website in a production environment and I want to duplicate it in a local environment. I exported the content of the website (with the 'Site Profile Exporter' module) but I cannot use it actually. I've got an issue with the database. I imported this one in MAMP then.

      I also exported the pages (with the 'ProcessPagesExportImport' module), but I cannot import it to my local website because the fields don't exist. So I created this fields, but I have this error :
      How can I use the elements that already exist and are presents in my database? How can I duplicate correctly the templates, fields and pages?
      Thanks by advance
      PS: Sorry if my english is bad
    • By hellerdruck
      Hi all
      I need help with something. Situation: We have let's say 2'000 Files (Excel) that should be displayed (list with links) on a page. We'd need to filter these files by given Keywords or a tree structure or both. Now, I'm looking for a solution whereas our customer can synchronise the files from his local computer with the folder on the webserver. They will update and upload files on a daily basis. Therefore, it would need to synchronise rather than load the files manually in pages or repeaters. Maybe indexing would be an idea, too.
      Are there any modules for Processwire that would help achieving this? Could anyone point me in the right direction?
      Thanks in advance.
    • By iNoize
      Hello, need some help for an RealEstate project. It have to use the OnOffice to import the objects. 
    • By hellomoto
      I can't tell what's wrong; my local development version appears just fine, but I copy over the site files and db online and the homepage content is not being contained. This is what it should look like (same site in the same browser, running on my localhost): http://imgur.com/UFZFzrd
      What could be the problem here? Sorry to bring up something so irrelevant to PW here, I just know that you all are a valiant and helpful group, and no one on StackExchange seems to even know what I'm talking about.
      Thanks a lot.
    • By maba
      I need to import regularly - every 15 or 30 days - a big .xslx file into my PW installation.
      This file now has 14 columns, 5.000 rows and grows every month.
      I'll need to group, order and work with these data to:
      analyse User monthly costs analyse User costs per Asset ... User (real AD account) has to match with a PW user - I can't join to the domain - but as you can see I have some services users (start with sca_*) or no user at all. Those rows have to be assigned to a specific user, e.g. account100.
      I would like to be able to have a kind of diff function to compare User assets between this and last month (and so on) other request is to have a notification when something change for a User between actual and latest import First request: which is the best solution to store those data in your opinion? Page, Table, Repeater Matrix, ...?
      Those are very repetitive data and I think a page reference is better than to import all the data every time but I have to understand how to manage those "dynamic" groups of software (AccType Det), hardware (Asset), ... For example Price will be imported and not stored with the description because it could be change in the future and I'll not have any control on it.
      User,OE,productNmr,AccType1,AccType Det,Count,Price (€),Sum,ASNA,CC,AccType Info,Asset,AccGroup,,,,,,,,,,,,,
  • Create New...