Resolve a url to a page

mrjasongorman · December 17, 2017

I've been working with different CMS's for quite a few years now but there was always one thing that bugged me, I never knew how the CMS takes a url and resolves a page ID from it. I knew it was to do with the "slug" but what i couldn't figure out is when it came to sub pages, as the slug only refers to the page itself not the parent pages in the url e.g /parent-page/sub-page.

The main two CMS's i've worked with are Wordpress and ProcessWire, ProcessWire always has the upper hand when it comes to speed, even a large PW site is tens of milliseconds faster than a fresh Wordpress install.

With the resolution of urls to pages being (probably) the most used query in a CMS i thought i'd investigate the two different approaches.

Both ProcessWire and Wordpress split the urls by forward slash to extract the slugs /about/people/ => ['about', 'people'], however ProcessWire takes a completely different approach when it comes to resolving a page ID from these slugs. ProcessWire uses this query:

SELECT pages.id,pages.parent_id,pages.templates_id
FROM `pages`
JOIN pages AS parent2 ON (pages.parent_id=parent2.id AND (parent2.name='about'))
JOIN pages AS rootparent ON (parent2.parent_id=rootparent.id AND rootparent.id=1)
WHERE pages.name='people'
AND (pages.status<9999999)
GROUP BY pages.id
LIMIT 0,1;

Resulting in a single item of the page in question including the page's template id. For urls with more parts it looks as though ProcessWire creates more JOINS to essentially walk back up the hierarchy and fully resolve the ID of the correct page.

Wordpress on the other hand takes a different approach:

SELECT ID, post_name, post_parent, post_type 
FROM wp_posts 
WHERE post_name IN ('about','people') 
AND post_type IN ('page','attachment');

More elegant looking however it returns a list of potential items. So requires a loop within PHP to walk the hierarchy tree and determine the correct page ID.

Both queries once cached by MySQL took 19-21ms to return, ProcessWire looks as though it uses this to it's advantage by returning the correct page ID straight from the MySQL cache and doesn't require the extra looping step afterwards.

Interesting to see the different approaches to the same problems.

Sign In

Resolve a url to a page

Recommended Posts

mrjasongorman

Create an account or sign in to comment

Create an account

Sign in

Recently Browsing 0 members

Similar Content

Error: Exception: SQLSTATE[42S22]: Column not found: 1054 Unknown column 'page_path_history.language_id' in 'field list' (in wire/modules/PagePathHistory.module line 752)

Losing parameters when using pagination

Multi-language URL field

Custom Paths

Allow protocol file:// in FieldType URL and CKEditor links

Browse

Activity

My Activity Streams

Support

Store

My Details