ProcessWire updates and additions in progress

3 April 2020 by Ryan Cramer

This week we’ve got a few new and interesting core updates in progress, though none quite ready to release just yet. So rather than releasing version 3.0.154 today, I thought we'd instead take a brief look at what’s coming over the next few weeks…

Expanded file and image field properties
New PageFamily API and class
New ProCache version with built-in crawler

Expanded file and image field properties

ProcessWire’s file and image fields consider the files they store to be the source of properties like width, height, file size (bytes) and such. In other words, this information is not stored in the database and is instead obtained directly from the files on the file system. That makes it quite efficient, since no information is duplicated between the database and the file system.

While this setup is efficient, there are also some drawbacks. Information obtained from the files can’t be easily searched the way everything else in ProcessWire can. For instance, we can’t currently do things like find all pages that have an image at least 1000 pixels wide and 600 pixels tall (or some other dimensions you choose). Nor can we tell which images are portrait or landscape (or some aspect ratio) without loading the page and accessing the images. When it comes to storage, we can’t move our assets to some other file storage system (like S3) since PW needs direct file system access to the files in order to know anything about their size or dimensions.

Given all of this, I’ve been working on improving our file and image fields so that properties like file size, dimensions, aspect ratio and variations become properties of the file fields that are stored in the database. That way the information can be searched with the page finder, and it can be accessed and queried without needing to hit the file system. Not only does this have search and performance benefits, but it also means it’ll open the door to supporting more asset storage options.
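To make the idea concrete, here is a minimal Python sketch of why storing these properties in the database helps. It is an illustration only, not ProcessWire's actual schema or API: the `field_images` table layout, the column names and the helper functions are all assumptions. Once width, height, file size and ratio live in a table row, questions like "at least 1000 pixels wide and 600 pixels tall" or "which images are portrait" become plain queries that never touch the file system.

```python
import sqlite3

# Hypothetical table layout for illustration only; not ProcessWire's real schema.
SCHEMA = """
CREATE TABLE field_images (
    pages_id INTEGER NOT NULL,   -- page that owns the file
    data     TEXT    NOT NULL,   -- file basename
    filesize INTEGER NOT NULL,   -- bytes
    width    INTEGER NOT NULL,   -- pixels
    height   INTEGER NOT NULL,   -- pixels
    ratio    REAL    NOT NULL    -- width / height, stored so it can be searched
)
"""

def add_image(db, pages_id, name, filesize, width, height):
    """Record an image's properties once, at upload or page-save time."""
    db.execute(
        "INSERT INTO field_images VALUES (?, ?, ?, ?, ?, ?)",
        (pages_id, name, filesize, width, height, round(width / height, 2)),
    )

def find_page_ids(db, min_width, min_height):
    """Return IDs of pages that have at least one image of the minimum size."""
    rows = db.execute(
        "SELECT DISTINCT pages_id FROM field_images "
        "WHERE width >= ? AND height >= ? ORDER BY pages_id",
        (min_width, min_height),
    )
    return [row[0] for row in rows]

def find_portrait_images(db):
    """Return basenames of images that are taller than they are wide."""
    rows = db.execute(
        "SELECT data FROM field_images WHERE ratio < 1 ORDER BY data"
    )
    return [row[0] for row in rows]

# Made-up sample data: two images on page 1010, one on page 1022.
db = sqlite3.connect(":memory:")
db.execute(SCHEMA)
add_image(db, 1010, "hero.jpg", 482113, 1600, 900)
add_image(db, 1010, "thumb.jpg", 20480, 300, 200)
add_image(db, 1022, "poster.jpg", 311200, 800, 1200)

print(find_page_ids(db, 1000, 600))   # pages with a large enough image
print(find_portrait_images(db))       # portrait-oriented images
```

Because the queries read only the table, the files themselves could sit on any storage backend, which is the point made above about opening the door to other asset storage options.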

This is a capability that will be rolled out in parts because it’s not something that can be just turned on and off. In order to store the information in the first place, it needs to load all those pages and access all their files. It might be something that takes some time to do on an existing large site. So what I’m looking to do in the short term is just have PW start to maintain and store the information as files are uploaded and/or pages are saved, ensuring its presence for future projects. Of course, we’ll also come up with a way to pre-populate the info for existing sites too.

Following these file and image field improvements, I’m also looking to provide API methods dedicated to searching files/images independently of pages. So if you wanted to find one (or many) images matching some criteria, you could do so, and it would find and return them regardless of what page they were found on (while still adhering to access control of course).

New PageFamily API and class

ProcessWire’s rules for family relations between pages are currently not in any single defined location and I’ve been looking to give them a more formal and central API. This encompasses logic like identifying what parent and child types are allowed for any given page, where pages are allowed to be placed, and how many, and so on.

Currently the logic is kind of split between access control, template, page adding and saving methods/processes, with logic directly where it’s needed. This works fine, but as new needs arise, I’ve been wanting to bring these under a central location to avoid potential redundancy in logic. Much of it is currently being built out in a new PageFamily class. It is used by Page objects as a helper in the same way the existing PageTraversal and PageComparison classes are. I’ll cover more on this once it’s ready to commit to the core.
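As a rough illustration of what centralizing these rules could look like, here is a small Python sketch. It is not the PageFamily class itself, whose API has not been published; the `Template` and `FamilyRules` names, their attributes and the sample template names are all hypothetical. The point is only that a single helper answers "what child types are allowed here?" instead of each caller re-implementing the checks.

```python
from dataclasses import dataclass, field

@dataclass
class Template:
    """Hypothetical stand-in for a template's family settings."""
    name: str
    child_templates: list = field(default_factory=list)   # empty = any child type
    parent_templates: list = field(default_factory=list)  # empty = any parent type
    no_children: bool = False                              # True = may not have children

class FamilyRules:
    """Hypothetical central location for parent/child rules."""

    def __init__(self, templates):
        self.templates = {t.name: t for t in templates}

    def can_add_child(self, parent_name, child_name):
        """True if a page of type child_name may live under type parent_name."""
        parent = self.templates[parent_name]
        child = self.templates[child_name]
        if parent.no_children:
            return False
        # The parent may restrict which child types it accepts...
        if parent.child_templates and child_name not in parent.child_templates:
            return False
        # ...and the child may restrict which parent types it accepts.
        if child.parent_templates and parent_name not in child.parent_templates:
            return False
        return True

    def allowed_child_types(self, parent_name):
        """List every template name that may be added under parent_name."""
        return [n for n in self.templates if self.can_add_child(parent_name, n)]

rules = FamilyRules([
    Template("blog", child_templates=["blog-post"]),
    Template("blog-post", parent_templates=["blog"], no_children=True),
    Template("basic-page"),
])

print(rules.can_add_child("blog", "blog-post"))
print(rules.allowed_child_types("basic-page"))
```

Access control, page adding and page saving could then all call the same two methods, which is the redundancy-avoiding benefit described above.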

New ProCache version with built-in crawler

Another thing I’ve been working on here is a new version of ProCache. It’s a major overhaul and refactoring of the module which finally drops support for PW 2.x versions and now takes advantage of being fully PW 3.x native.

One of the major new features in this version is a built-in web crawler that can crawl all the pages in your site, or just those matching some criteria that you specify. The main purpose of the crawler is so that you can now have the option of priming the cache with the crawler rather than with live traffic.

The crawler is useful for other things too. For instance, I’m currently developing an existing site with a new design where all the image sizes have changed. Because there are a lot of images, and we need to generate multiple size variations (plus WebP versions of each), the first render can take literal minutes in some cases. So I let a crawler run overnight to take care of pre-rendering all those image sizes and variations on the development site so that live traffic never has to wait.

The ProCache crawler is unique in that it’s a PW-specific crawler and knows how PW handles multi-language, pagination and URL segments. It gets URLs to crawl directly from the database, so it is not dependent upon link structure in the markup (though it is able to read that as well). It can also focus on specific groups of pages, such as all pages using a particular template, or literally anything you can specify with a $pages->find() selector. The ProCache crawler can be used from the ProCache admin (via a special console window like the one in ListerPro), but it can also be used from the API, like in a shell script triggered from a cron job.
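The following Python sketch shows the general idea of building a crawl list from page records rather than from links in the markup. It is not ProCache's code or API: the record format, the function names, the `page2` style pagination suffix and the language prefix handling are all assumptions made for illustration.

```python
def urls_for_page(path, lang_prefixes=("",), pages=1, segments=()):
    """Expand one page path into every URL variant a crawler should request."""
    urls = []
    for prefix in lang_prefixes:
        # An empty prefix stands for the default language (assumption).
        base = f"/{prefix}{path}" if prefix else path
        urls.append(base)
        # Pagination variants beyond the first page (assumed "pageN/" suffix).
        for n in range(2, pages + 1):
            urls.append(f"{base}page{n}/")
        # Known URL segments appended to the page path.
        for seg in segments:
            urls.append(f"{base}{seg}/")
    return urls

def crawl_list(records, template=None):
    """Build the crawl list from page records, optionally for one template."""
    urls = []
    for rec in records:
        if template and rec["template"] != template:
            continue
        urls.extend(urls_for_page(
            rec["path"],
            rec.get("languages", ("",)),
            rec.get("pages", 1),
            rec.get("segments", ()),
        ))
    return urls

# Made-up records standing in for rows read from the database.
records = [
    {"path": "/", "template": "home"},
    {"path": "/blog/", "template": "blog",
     "languages": ("", "de"), "pages": 2, "segments": ("rss",)},
]

for url in crawl_list(records, template="blog"):
    print(url)
```

Because the list comes from the records themselves, a page is crawled even if nothing links to it, and limiting the crawl to a group of pages is just a filter on the records.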

If all continues well with development and testing, this new version of ProCache should hopefully be ready by the end of the month.

Consider this just a brief introduction to some work in progress. Thanks for reading, and stay tuned for more details on all of these things as they are completed. I hope you all have a good weekend and stay well. Enjoy reading the ProcessWire Weekly for the latest ProcessWire news and updates, and subscribe to the weekly newsletter.

Comments

Joe
- 6 years ago
Excited that we are working to being able to use s3 as a file store!
dotnetic
- 6 years ago
Nice additions. Looking forward to the new ProCache version.
Markus Tiefenbacher
- 6 years ago
Cool, we use Adobe Scene7 and Cloudinary (not public modules) as Filestore for images.
