Weekly update – 17 June 2022: Making PW render at WP URLs


This week I've been busy developing a site. It's the same one as the last few weeks: an established site moving out of WordPress and into ProcessWire. Next week the whole thing is getting uploaded to its final server for collaboration and for client preview. I've been pretty focused on getting that ready and don't have any core updates to report this week, though I should have some next week.

One thing about the prior version of this site (and perhaps many WordPress sites) is that there wasn't much structure to its pages. Hundreds of unrelated pages confusingly live off the root, such as /some-product/ and /some-press-release/ and /some-other-thing/, with no apparent structure or order. Those pages that do have some loose structure in the wp-admin have URLs that don't reflect it on the front-end, so there's very little relation between the structure one sees in the wp-admin and the structure one sees in the URLs or the front-end navigation. That's one thing I've tried to fix, so that there is some logic and structure rather than a bunch of unrelated pages all in the same bucket (is this common in WordPress?).

But there's one big caveat. We didn't want to change anything about the actual URLs that are used on the site. This is a site with a long history, a lot of incoming links, and a lot of search traffic. The current URLs have been in place a long time and we didn't want to introduce more redirects into the site (there are already a ton of 301 redirects accumulated over time). So we wanted to make sure the existing URLs in the new ProcessWire-powered site are identical to what they were in the WordPress site. That might seem difficult to do, going from an unstructured WordPress site into a highly structured ProcessWire site... but actually it's very simple. Here's how:

Making ProcessWire render pages from WordPress URLs

We created a new URL field named "href" and added it to most of the site's templates. For established pages that came in from the old WP site, this "href" field contains the WordPress URL for the page. Depending on the page, it contains values like /some-product/ or /some-press-release/, /some-country/some-town/, etc. In most cases this is different from the actual ProcessWire URL, since the page structure is now much more orderly in the back-end. And for newly added pages, we'll be using the more logical ProcessWire URL. But for all the old WordPress pages, we'll make them render from their original URL. This is certainly preferable from an SEO standpoint but also helps to limit the redirects in the site.

In order to make $page->url return the old/original WordPress URL (value in $page->href), a hook was added to the /site/init.php file:

/**
 * Update page paths and URLs to use the 'href' field value on pages that have it
 *
 */
$wire->addHookBefore('Page::path', function(HookEvent $event) {
  $page = $event->object; /** @var Page $page */
  $href = $page->get('href');
  if(!$href) return; // skip if page has no 'href' value
  $event->return = $href;
  $event->replace = true;
});
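
Why hook Page::path when the goal is $page->url? As far as I understand it, ProcessWire builds a page's URL from the installation's root URL plus the page's path, so replacing the path carries through to the URL. The standalone sketch below only illustrates that relationship. Note that urlFromPath() is a hypothetical helper (not a ProcessWire function) and the root values are made-up examples:

```php
<?php
/**
 * Hypothetical helper (not part of ProcessWire): join a site root URL
 * and a page path into the URL that would be output for the page.
 */
function urlFromPath(string $rootUrl, string $path): string {
  return rtrim($rootUrl, '/') . '/' . ltrim($path, '/');
}

// Site installed at the web root
echo urlFromPath('/', '/some-product/'), "\n";        // /some-product/

// Site installed in a subdirectory (assumed example)
echo urlFromPath('/mysite/', '/some-product/'), "\n"; // /mysite/some-product/
```

This is also why the "href" values are stored relative to the site root rather than as full URLs: the same value keeps working if the site moves into or out of a subdirectory.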

Now we've got $page->url calls returning the URLs we want, but how do we get ProcessWire to accept those URLs for rendering pages?

The first thing we'll need to do is enable URL segments for the homepage template. We do this by going to: Setup > Templates > home > URLs > and check the box to enable URL segments. Save.

Then we need to edit our /site/templates/home.php to identify when URL segments are present and render the appropriate page, rather than the homepage:

$href = $input->urlSegmentStr;
if($href) {
  $href = '/' . trim($href, '/') . '/';
  $page = $pages->get("href=$href");
  if(!$page->id || !$page->viewable()) wire404();
  $wire->wire('page', $page); // set new $page API var
  include($page->template->filename); // include page's template file
} else {
  // render homepage output
}

As you can see from the above, when URL segments are present, we find a page that has an "href" field value matching those URL segments ($input->urlSegmentStr). If we don't find one, we stop with a 404. But if we do find one, then we set the $page API variable to it and then include its template file, making that page render rather than the homepage. If there is no $input->urlSegmentStr present then of course we just render the homepage.
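
To make the three possible outcomes easier to see (homepage, matched page, or 404), here is a standalone sketch of the same decision with the ProcessWire parts swapped out. The resolveHref() function and the $map data are hypothetical: an in-memory array stands in for the $pages->get() lookup, and a null return stands in for wire404():

```php
<?php
/**
 * Hypothetical stand-in for the lookup done in home.php: resolve a URL
 * segment string to a page ID using an in-memory map of 'href' values.
 * Returns 1 (the homepage ID) when there are no URL segments, the matched
 * page ID when an 'href' matches, or null when nothing matches (a 404).
 */
function resolveHref(string $segmentStr, array $hrefToPageId): ?int {
  if($segmentStr === '') return 1; // no URL segments: render the homepage
  $href = '/' . trim($segmentStr, '/') . '/'; // same normalization as home.php
  return $hrefToPageId[$href] ?? null; // null means no match
}

// Made-up data: 'href' values mapped to page IDs
$map = [
  '/some-product/' => 1021,
  '/some-country/some-town/' => 1187,
];

var_dump(resolveHref('', $map));                        // int(1)
var_dump(resolveHref('some-product', $map));            // int(1021)
var_dump(resolveHref('some-country/some-town/', $map)); // int(1187)
var_dump(resolveHref('no-such-page', $map));            // NULL
```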

That's it! By using these small code snippets to override ProcessWire's URL routing, all the URLs of the old WordPress site are now handled by ProcessWire.

Like most things in ProcessWire, there's more than one way to accomplish this…

We could have used URL/path hooks, or we could have hooked before Page::render to take over homepage requests with URL segments, before the homepage even got involved. Or perhaps we could have hooked in even earlier, to something in the ProcessPageView module or PagesRequest class. Or we could have used an existing module. Any of these might be equally good (or even better) solutions, but I just went with what seemed like the simplest route, one that I could easily see and adjust. Plus, it'll work in any version of ProcessWire. 

The actual solution I used is a little more than what's presented here, as it has a few fallbacks for finding pages and scanning redirect lists, plus passes along remaining pagination/URL segments to rendered pages. I'm guessing most don't need that stuff, and it adds a decent chunk of code, so I left that out. But there are a couple of optional additions that I would recommend adding in a lot of cases:

Forcing pages to only render from their "href" URL and not their ProcessWire URL (optional)

Our existing hooks ensure that any URLs output for pages having "href" values are always based on that "href" value. But what if someone accesses a given page at its ProcessWire path/url? The page would render normally. We might want to prevent that from happening, ensuring that it only ever renders at its defined "href" URL instead. If the page is requested at its ProcessWire URL, then we redirect to its "href" URL. We can do that by adding the following code to our /site/templates/_init.php file:

if($page->id > 1) {
  // ensure that we are rendering from the 'href' URL
  // rather than native PW url, when href is populated
  $href = $page->get('href');
  $requestUrl = isset($_SERVER['REQUEST_URI']) ? $_SERVER['REQUEST_URI'] : '';
  $hrefUrl = $config->urls->root . ltrim((string) $href, '/');
  // compare from the start of the request URL, since a native PW URL like
  // /products/some-product/ also contains an href like /some-product/
  if($href && $requestUrl && strpos($requestUrl, $hrefUrl) !== 0) {
    // request URL does not begin with the href URL, so redirect to it
    $session->redirect($hrefUrl);
  }
}
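
One detail worth thinking about with this kind of check is which part of the request URL gets compared. Comparing only the path portion, anchored at the start, avoids false matches when the "href" value happens to appear in a query string or deeper within some other path, while still allowing pagination or URL segments after it. Here's a standalone sketch of that idea. The isRequestAtHref() function is a hypothetical helper of my own naming, and the URLs are made-up examples:

```php
<?php
/**
 * Hypothetical helper: true when the path portion of the request URI
 * begins with the page's 'href' value (prefixed by the site root URL),
 * so trailing pagination or URL segments are still allowed.
 */
function isRequestAtHref(string $requestUri, string $href, string $rootUrl = '/'): bool {
  $path = parse_url($requestUri, PHP_URL_PATH); // drop the query string
  if(!is_string($path)) return false; // malformed or empty request URI
  $expected = rtrim($rootUrl, '/') . '/' . ltrim($href, '/');
  return strpos($path, $expected) === 0;
}

var_dump(isRequestAtHref('/some-product/?utm=1', '/some-product/'));      // bool(true)
var_dump(isRequestAtHref('/some-product/page2/', '/some-product/'));      // bool(true)
var_dump(isRequestAtHref('/products/some-product/', '/some-product/'));   // bool(false)
var_dump(isRequestAtHref('/search/?q=/some-product/', '/some-product/')); // bool(false)
```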

Enforcing uniqueness and slashes in "href" values (optional)

It's worthwhile to enforce our "href" values being consistent in having both a leading and trailing slash, as well as making sure they are always unique, so no two pages can have the same "href" value. To do that, I added this hook to my /site/ready.php file (/site/init.php would also work): 

/**
 * Ensure that any saved 'href' values are unique and have leading/trailing slashes
 * 
 */
$wire->addHookAfter('Pages::savePageOrFieldReady', function(HookEvent $event) {
  
  $page = $event->arguments(0); /** @var Page $page */
  $fieldName = $event->arguments(1); /** @var string $fieldName */
  
  if($fieldName === 'href' || in_array('href', $page->getChanges())) {
    // the href field is being saved 
    $href = $page->get('href');
    if(!strlen($href)) return;
  
    // make sure value has leading and trailing slashes
    if(strpos($href, '/') !== 0 || substr($href, -1) !== '/') {
      $href = '/' . trim($href, '/') . '/';
      $page->set('href', $href);
    }

    // make sure that the 'href' value is unique
    $pages = $event->object; /** @var Pages $pages */
    $p = $pages->get("id!=$page->id, href=$href");
    if($p->id && !$p->isTrash()) {
      $page->set('href', '');
      $page->error(
        "Path of '$href' is already in use by page $p->id “{$p->title}” - " . 
        "Please enter a different “href” path and save again"
      );
    }
  }
});
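
If you want to try out the normalization and uniqueness rules without saving any pages, they can be sketched as two plain PHP functions. Both normalizeHref() and findHrefConflict() are hypothetical helpers written for this example, and the $existing array is made-up data standing in for the $pages->get() lookup in the hook:

```php
<?php
/**
 * Hypothetical helper: give an 'href' value a leading and trailing slash.
 * An empty value stays empty, since the hook skips pages without an href.
 */
function normalizeHref(string $href): string {
  if($href === '') return '';
  $trimmed = trim($href, '/');
  return $trimmed === '' ? '/' : '/' . $trimmed . '/';
}

/**
 * Hypothetical helper: find another page already using $href.
 * Returns the conflicting page ID, or null when the value is unique.
 */
function findHrefConflict(string $href, int $savingPageId, array $hrefByPageId): ?int {
  foreach($hrefByPageId as $pageId => $existingHref) {
    if($pageId !== $savingPageId && $existingHref === $href) return $pageId;
  }
  return null;
}

// Made-up data: page IDs mapped to their saved 'href' values
$existing = [1021 => '/some-product/', 1187 => '/some-country/some-town/'];

echo normalizeHref('some-press-release'), "\n"; // /some-press-release/
echo normalizeHref('/some-product'), "\n";      // /some-product/
var_dump(findHrefConflict('/some-product/', 1300, $existing)); // int(1021)
var_dump(findHrefConflict('/some-product/', 1021, $existing)); // NULL
```

The last call returns null because a page re-saving its own unchanged "href" should not conflict with itself, which is what the id!=$page->id part of the selector in the hook takes care of.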

That's all for this week. Thanks for reading, have a great weekend!
