Jump to content

Resolve a url to a page


mrjasongorman
 Share

Recommended Posts

I've been working with different CMS's for quite a few years now but there was always one thing that bugged me, I never knew how the CMS takes a url and resolves a page ID from it. I knew it was to do with the "slug" but what i couldn't figure out is when it came to sub pages, as the slug only refers to the page itself not the parent pages in the url e.g /parent-page/sub-page.

The main two CMS's i've worked with are Wordpress and ProcessWire, ProcessWire always has the upper hand when it comes to speed, even a large PW site is tens of milliseconds faster than a fresh Wordpress install.

With the resolution of urls to pages being (probably) the most used query in a CMS i thought i'd investigate the two different approaches.

Both ProcessWire and Wordpress split the urls by forward slash to extract the slugs /about/people/ => ['about', 'people'], however ProcessWire takes a completely different approach when it comes to resolving a page ID from these slugs. ProcessWire uses this query: 

SELECT pages.id,pages.parent_id,pages.templates_id
FROM `pages`
JOIN pages AS parent2 ON (pages.parent_id=parent2.id AND (parent2.name='about'))
JOIN pages AS rootparent ON (parent2.parent_id=rootparent.id AND rootparent.id=1)
WHERE pages.name='people'
AND (pages.status<9999999)
GROUP BY pages.id
LIMIT 0,1;

Resulting in a single item of the page in question including the page's template id. For urls with more parts it looks as though ProcessWire creates more JOINS to essentially walk back up the hierarchy and fully resolve the ID of the correct page. 

 

Wordpress on the other hand takes a different approach:

SELECT ID, post_name, post_parent, post_type 
FROM wp_posts 
WHERE post_name IN ('about','people') 
AND post_type IN ('page','attachment');

More elegant looking however it returns a list of potential items. So requires a loop within PHP to walk the hierarchy tree and determine the correct page ID.

Both queries once cached by MySQL took 19-21ms to return, ProcessWire looks as though it uses this to it's advantage by returning the correct page ID straight from the MySQL cache and doesn't require the extra looping step afterwards.

 

Interesting to see the different approaches to the same problems.

 

  • Like 3
Link to comment
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
 Share

×
×
  • Create New...