CMSCritic Development Case Study


ryan

  • 8 months later...

I'm just using a variation of this, and although I'm importing a large number of posts and pages (pages were quite interesting: not a lot of changes were required to import the tree structure as well! :)), I do of course still have to go through and check all the pages just in case.

This has been a huge help - thanks Ryan!


  • 3 months later...
  • 3 years later...

@ryan Thanks a lot for your detailed posts and code-examples! 

I will probably have to do a massive export from WP in the near future, so I was taking a closer look at the WP data structure and at ways to accomplish it. In my case, the old WP site has so many plugins that the DB queries are a nightmare. Exporting Tribe events is a bit tedious, but doable. With ACF though, it will be a nightmare (ACF is the standard commercial add-on if you'd like custom fields, but compared to PW it's kindergarten). Most probably, I will have to re-create those with Repeater Matrix, and re-build each item via API... sigh.


On 6/22/2019 at 11:24 AM, dragan said:

I will probably have to do a massive export from WP in the near future, so I was taking a closer look at the WP data structure and at ways to accomplish it. In my case, the old WP site has so many plugins that the DB queries are a nightmare. Exporting Tribe events is a bit tedious, but doable. With ACF though, it will be a nightmare (ACF is the standard commercial add-on if you'd like custom fields, but compared to PW it's kindergarten). Most probably, I will have to re-create those with Repeater Matrix, and re-build each item via API... sigh.

I have no experience with this Tribe events thing, so can't speak for that, but for the bulk of the content I'd recommend skipping the database export idea and going with the built-in REST API. While the REST API has its quirks, going directly to the database for exports is going to be a major pain in the a*s in comparison. Not entirely unlike exporting ProcessWire content with raw SQL...

It would be best to have a separate copy of the site at hand first, but after that it's basically as simple as installing the ACF to REST API plugin (which adds ACF field data to the REST API results) – and of course making sure that the REST API is enabled in the first place. You should be able to query and export all your pages, articles, and any custom post types with this method. Once you have the data, importing it to ProcessWire is going to be a breeze.
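
For anyone wanting to try the REST route, here is a minimal Python sketch of the export loop. It assumes the standard `/wp-json/wp/v2/<type>` endpoints with the `per_page` and `page` parameters and the `X-WP-TotalPages` response header. The function names `fetch_page` and `export_all` and the example URL are made up for illustration. With the ACF to REST API plugin active I'd expect the custom fields to show up in each item (under an `acf` key, as far as I know), but verify that against your own site.

```python
import json
import urllib.parse
import urllib.request


def fetch_page(base_url, post_type, page, per_page=100):
    """Fetch one page of results; returns (items, total_pages)."""
    query = urllib.parse.urlencode({"per_page": per_page, "page": page})
    url = f"{base_url.rstrip('/')}/wp-json/wp/v2/{post_type}?{query}"
    with urllib.request.urlopen(url) as response:
        # WordPress reports the number of result pages in this header
        total_pages = int(response.headers.get("X-WP-TotalPages", "1"))
        return json.load(response), total_pages


def export_all(base_url, post_type="posts", fetch=fetch_page):
    """Walk every result page and collect all items into one list."""
    items, page, total_pages = [], 1, 1
    while page <= total_pages:
        batch, total_pages = fetch(base_url, post_type, page)
        items.extend(batch)
        page += 1
    return items


if __name__ == "__main__":
    # Hypothetical site URL; point this at a copy of the old site.
    pages = export_all("https://old-site.example", "pages")
    print(len(pages), "pages exported")
```

The `fetch` argument is only there so the loop can be tried with canned data before pointing it at a real site. Custom post types need to be exposed to the REST API (`show_in_rest`) before they appear under their own endpoint.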

(Note: based on some quick googling the Tribe events plugin also provides REST API support. I haven't used this, but it looks promising.)

Also, in case that idea won't pan out, you can always rely on existing solutions, such as WP All Export (which provides a GUI for exporting content, including ACF data). Admittedly I haven't worked with this plugin before, but it's the companion plugin for WP All Import, the de facto standard plugin solution for complex imports. WP All Import is a bit of a monster and can feel clunky (hence devs often prefer custom imports for long-running, often-used, scheduled stuff), but for one-off cases it's a really handy tool.

--

Edit: in case anyone is wondering, the WP REST API was first announced on June 17th 2013, which would be a week or so after Ryan started this thread. It didn't make its way into the core until 2015, and even then it was for a very long time considered "a work in progress". It's been more than five years since this thread was started, so it shouldn't come as a big surprise that some things have changed.


@teppo Thanks for your insights and links. I have stumbled upon WP import/export plugins, and tried out two or three that sounded promising, but none of them did a clean job with ACF pages.

I will definitely take a closer look at WP REST.

The task I will definitely hate the most is fixing internal links. WP stores them as hardcoded links. If all URLs inside PW change* (conceptual question of course), updating those countless links inside RTE fields to avoid broken links will be tedious. But I'm sure I'll whip up something that'll work (regex, .htaccess 301 redirects, storing old URLs in hidden PW fields too etc.)

* i.e. in case we're not keeping the old WP structure and hence not going to use /site/templates/includes/hooks.php like Ryan did (his first post in this thread)


On 6/22/2019 at 1:52 PM, dragan said:

The task I will definitely hate the most is fixing internal links. WP stores them as hardcoded links. If all URLs inside PW change* (conceptual question of course), updating those countless links inside RTE fields to avoid broken links will be tedious. But I'm sure I'll whip up something that'll work (regex, .htaccess 301 redirects, storing old URLs in hidden PW fields too etc.)

* i.e. in case we're not keeping the old WP structure and hence not going to use /site/templates/includes/hooks.php like Ryan did (his first post in this thread)

That can definitely be a bit of a pain, and applies to any migration really.

The good thing about the way WordPress handles internal links is that they are all (assuming they've not been modified via hooks or plugins, and a plugin hasn't been producing loads of "non-standard" links) absolute URLs with a predefined prefix for each custom post type, and as such – in a lot of cases – you can just run a string replace on the exported data. If you're working on a copy of the site, you can also use WP-CLI (if you have it installed) and run something like "wp search-replace 'https://OLD' 'https://NEW' --all-tables" before the export.

Of course if you move things around a lot, it's not going to be quite as simple.
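
When some sections do move, a plain string replace isn't enough. Here is a rough Python sketch of one way to handle that on the exported HTML: swap the old base URL for the new one and look up moved paths in a map. The function name `rewrite_internal_links`, the `moved` map, and the example URLs are all hypothetical, and the regex assumes internal links are absolute URLs starting with the old base, as described above.

```python
import re


def rewrite_internal_links(html, old_base, new_base, moved=None):
    """Point absolute links under old_base at new_base.

    `moved` optionally maps old paths to new paths for pages that
    changed location; paths not in the map are kept as they are.
    """
    moved = moved or {}
    old_base = old_base.rstrip("/")
    new_base = new_base.rstrip("/")
    # The lookahead stops "https://old.example" from matching inside
    # a different host such as "https://old.example.org".
    pattern = re.compile(re.escape(old_base) + r"(?![\w.-])(/[^\s\"'<>]*)?")

    def swap(match):
        path = match.group(1) or ""
        return new_base + moved.get(path, path)

    return pattern.sub(swap, html)


if __name__ == "__main__":
    body = '<a href="https://old.example/blog/hello/">Hello</a>'
    print(rewrite_internal_links(
        body,
        "https://old.example",
        "https://new.example",
        moved={"/blog/hello/": "/posts/hello/"},
    ))
```

The lookup is an exact match on the path, so links carrying query strings or fragments would need extra handling. The same old-to-new map could also feed the 301 redirects.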


  • 1 year later...

  

On 6/6/2013 at 3:44 PM, ryan said:

What might also be of interest is the homepage template, as it handles the other part of routing of post URLs since they are living off the root rather than in /posts/. That means the homepage is what is triggering the render of each post:

/site/templates/home.php


if(strlen($input->urlSegment2)) {
  // we only accept 1 URL segment here, so 404 if there are any more
  throw new Wire404Exception();

} else if(strlen($input->urlSegment1)) {
  // render the blog post named in urlSegment1 
  $name = $sanitizer->pageName($input->urlSegment1);
  $post = $pages->get("/posts/")->child("name=$name");
  
  if($post->id) echo $post->render();
    else throw new Wire404Exception();
  
  // tell _main.php not to include itself after this
  $renderMain = false;

} else {
  // regular homepage output
  $limit = 7; // number of posts to render per page
  $posts = $pages->find("parent=/posts/, limit=$limit, sort=-date");
  $body = renderPosts($posts); 
}

 

After a lot of googling and many years, this still seems like one of the best approaches to removing an undesirable part of a URL: for example, if you wanted to group a bunch of landing pages as children of "Landing Pages" but didn't want /landing-pages in your URL.

However, I did run into a pretty significant issue with it: it bypasses any permissions you have set to prevent page views. So if guests don't have permission to view the page, it will still load that page.


40 minutes ago, cjx2240 said:

However, I did run into a pretty significant issue with it: it bypasses any permissions you have set to prevent page views. So if guests don't have permission to view the page, it will still load that page.

@cjx2240

This depends on the selector you use to grab those pages. In the example, $pages->get() is used, meaning the page will be retrieved regardless of access restrictions. Are you using similar code? Please see the documentation:

https://processwire.com/docs/selectors/#access_control

Quote

Note that $pages->get(…); is not subject to this behavior (or access control) and include=all is assumed. This is because requesting a single page is a very specific request, and not typically used for generating navigation. To put it another way, if you are asking for a single specific page, we assume you mean it. If you want to retrieve a single page where include=all is not assumed, then use $pages->findOne(…) instead.

ps: I am not sure if ->child is also get-like.


  • 3 months later...
  • 1 year later...
