Peter Knight

MODX content (including images) to ProcessWire

I'm redeveloping a site from MODX to ProcessWire. The client has about 1200 pages and exporting the current database to a CSV and then importing content has been relatively easy using the CSV to pages module.

Client is fine with manually porting over 1200 images, but as he's just had his 3rd child and is a busy man, I was wondering if there was a way to pull the images from each MODX post and import them into an Images field on each new PW page.

Looking for general thoughts. I'll have to outsource this as my own database chops are minimal but having some insight can help me brief a dev.

Many thanks

Peter

I did this on a smaller scale. I'll see if I can find my code but it won't be til tomorrow unfortunately.

It basically just looked for the images in the HTML using preg_match_all, I think, imported the images to the new page, replaced the image URLs in the HTML and saved the updated HTML. Worked quite well, but I was doing it with few enough pages that I was checking them a page at a time.

That way was a little less system-specific actually.
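
A minimal sketch of that regex approach, for briefing purposes only. The function names and the example paths are hypothetical, and the pattern assumes reasonably tidy markup (it would also pick up attributes such as data-src):

```php
<?php
// Hypothetical helper: collect the src of every <img> tag in an HTML string.
function findImageSources(string $html): array
{
    // Capture the value of src="..." or src='...' inside <img ...> tags.
    preg_match_all('/<img\b[^>]*?\bsrc\s*=\s*(["\'])(.*?)\1/i', $html, $matches);
    // De-duplicate so each image is only imported once.
    return array_values(array_unique($matches[2]));
}

// Hypothetical helper: swap old image URLs for new ones.
// $map is old URL => new URL. Note that strtr() replaces every occurrence
// of an old URL in the HTML, not only those inside <img> tags.
function replaceImageSources(string $html, array $map): string
{
    return strtr($html, $map);
}
```

The idea would be to call findImageSources() on each imported body, import each image, build the old-to-new map from the results, and save the output of replaceImageSources() back to the page.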

It is totally doable if you follow Ryan's tips here: https://processwire.com/talk/topic/3987-cmscritic-development-case-study/

It will help if you kept the old ID somewhere when you imported the posts. Did you?

Yes, I have page IDs. I presume that's good for creating a new PW page with the same ID as the old one. Are the page IDs some kind of bridge or reference for fetching images?

Thanks Pete. I'd be interested in seeing that although I wouldn't attempt it myself.

I was thinking of dumping all 1200+ into a containing page called "to be sorted" and that way my client can then manually move them into their correct locations over time.

I know it's still quite a bit of manual work but the new structure and templates are still being designed so I need a holding bay for them until we're ready to figure out where they go.

I've created a temporary field in PW called "old URL" which contains the previous full url (folder/folder/pagename) so he can easily identify where a post sat originally.

I've created a temporary field in PW called "old URL" which contains the previous full url (folder/folder/pagename) so he can easily identify where a post sat originally.

There you have your answer! Loop through all the pages in your PW install, get the old URL, pass it to a DOM parser, find all the images inside the content section of that page, get their URLs, and store the images in that page's image field.

Pseudo-code:

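// Assumes the simplehtmldom library (simple_html_dom.php) has been included.
// The template names below are placeholders for your imported templates.
foreach ($pages->find("template=all|imported|pages") as $p) {

    // If old_url holds only the path (folder/folder/pagename),
    // prepend the old site's domain first.
    $html = file_get_html($p->old_url);
    if (!$html) continue; // skip pages that could not be fetched

    // Every <img> inside the element with id="content"
    foreach ($html->find("#content img") as $img) {

        $src = $img->src;

        // The image URL is in $src now.
        // See Ryan's post in the cmscritic case study for how to extract
        // the file name, and use both the URL and the file name to import
        // the image into the image field of the page in $p.

    }
}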
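
A rough sketch of the "extract the file name" step left open in the comments above. This is not Ryan's code from the case study; resolveImage() and the example.com URLs are made up for illustration. It assumes the old page URL is absolute, ignores ports and "../" segments, and resolves relative paths against the page's own folder (if the old pages declare a base href, resolve against that instead):

```php
<?php
// Hypothetical helper: turn an <img> src into an absolute URL plus a file name.
// $pageUrl is the absolute URL of the old page the image was found on.
function resolveImage(string $src, string $pageUrl): array
{
    $base = parse_url($pageUrl);
    $origin = $base['scheme'] . '://' . $base['host'];

    if (preg_match('#^https?://#i', $src)) {
        $url = $src;                          // already absolute
    } elseif (strpos($src, '//') === 0) {
        $url = $base['scheme'] . ':' . $src;  // protocol-relative
    } elseif (strpos($src, '/') === 0) {
        $url = $origin . $src;                // root-relative
    } else {
        // Relative to the folder the old page lived in.
        $path = $base['path'] ?? '/';
        $dir = preg_replace('#/[^/]*$#', '/', $path);
        $url = $origin . $dir . $src;
    }

    // File name taken from the URL path, so any query string is dropped.
    $file = basename(parse_url($url, PHP_URL_PATH));

    return ['url' => $url, 'file' => $file];
}
```

Inside the loop, the resolved URL could then be handed to the page's image field, for example with $p->images->add($url) followed by $p->save(), assuming the field is named images and output formatting is turned off for $p.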

diogo's pretty spot on there - I did this all back before I knew things like simplehtmldom (in his above post) existed. If you get the contents imported and use that to iterate through all the image fields, you can indeed then use the article he links to to have PW pull the image files into PW and replace the output with simplehtmldom as you go.

My code was so specific to the site I was working on I don't think it's worth posting but I will try and dig it out tomorrow to see if there are any other useful pointers that arose from the process.
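
For the "replace the output as you go" part, here is a sketch that uses PHP's built-in DOMDocument rather than simplehtmldom, so it runs without extra libraries. rewriteImageSources() and the paths in the map are hypothetical, and DOMDocument may normalise the markup slightly when it serialises it:

```php
<?php
// Hypothetical helper: rewrite <img src> values in an HTML fragment.
// $map is old URL => new URL.
function rewriteImageSources(string $html, array $map): string
{
    $doc = new DOMDocument();
    // Sloppy real-world markup triggers parser warnings; collect them quietly.
    libxml_use_internal_errors(true);
    // Wrap the fragment in a single <div> and hint at UTF-8 so accents survive.
    $doc->loadHTML('<?xml encoding="UTF-8"><div>' . $html . '</div>');
    libxml_clear_errors();

    foreach ($doc->getElementsByTagName('img') as $img) {
        $src = $img->getAttribute('src');
        if (isset($map[$src])) {
            $img->setAttribute('src', $map[$src]);
        }
    }

    // Serialise only the children of the wrapper <div>, not the
    // <html>/<body> shell that loadHTML() adds around the fragment.
    $wrapper = $doc->getElementsByTagName('div')->item(0);
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

Unlike a plain string replace, this only touches src attributes on img tags, so links elsewhere in the content that happen to point at the same files are left alone.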
