Jump to content
Sign in to follow this  
mrjasongorman

Resolve a url to a page

Recommended Posts

I've been working with different CMS's for quite a few years now but there was always one thing that bugged me, I never knew how the CMS takes a url and resolves a page ID from it. I knew it was to do with the "slug" but what i couldn't figure out is when it came to sub pages, as the slug only refers to the page itself not the parent pages in the url e.g /parent-page/sub-page.

The main two CMS's i've worked with are Wordpress and ProcessWire, ProcessWire always has the upper hand when it comes to speed, even a large PW site is tens of milliseconds faster than a fresh Wordpress install.

With the resolution of urls to pages being (probably) the most used query in a CMS i thought i'd investigate the two different approaches.

Both ProcessWire and Wordpress split the urls by forward slash to extract the slugs /about/people/ => ['about', 'people'], however ProcessWire takes a completely different approach when it comes to resolving a page ID from these slugs. ProcessWire uses this query: 

SELECT pages.id,pages.parent_id,pages.templates_id
FROM `pages`
JOIN pages AS parent2 ON (pages.parent_id=parent2.id AND (parent2.name='about'))
JOIN pages AS rootparent ON (parent2.parent_id=rootparent.id AND rootparent.id=1)
WHERE pages.name='people'
AND (pages.status<9999999)
GROUP BY pages.id
LIMIT 0,1;

Resulting in a single item of the page in question including the page's template id. For urls with more parts it looks as though ProcessWire creates more JOINS to essentially walk back up the hierarchy and fully resolve the ID of the correct page. 

 

Wordpress on the other hand takes a different approach:

SELECT ID, post_name, post_parent, post_type 
FROM wp_posts 
WHERE post_name IN ('about','people') 
AND post_type IN ('page','attachment');

More elegant looking however it returns a list of potential items. So requires a loop within PHP to walk the hierarchy tree and determine the correct page ID.

Both queries once cached by MySQL took 19-21ms to return, ProcessWire looks as though it uses this to it's advantage by returning the correct page ID straight from the MySQL cache and doesn't require the extra looping step afterwards.

 

Interesting to see the different approaches to the same problems.

 

  • Like 3

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
Sign in to follow this  

  • Recently Browsing   0 members

    No registered users viewing this page.

  • Similar Content

    • By humanafterall
      Hi,
      I have a URL field that will sometimes have relative/local URLs on a multilingual site, for example /contact/ 

      However the URL field does not seem to pick up when I'm on another language, for example /fr/ so I'm taken to the default language page for /contact/ rather than /fr/contact/
      Is there a way to make the URL fields play well with a multi-language site?
      Thanks!
       
    • By Robin S
      A new module that hasn't had a lot of testing yet. Please do your own testing before deploying on any production website.
      Custom Paths
      Allows any page to have a custom path/URL.
      Note: Custom Paths is incompatible with the core LanguageSupportPageNames module. I have no experience working with LanguageSupportPageNames or multi-language sites in general so I'm not in a position to work out if a fix is possible. If anyone with multi-language experience can contribute a fix it would be much appreciated!
      Screenshot

      Usage
      The module creates a field named custom_path on install. Add the custom_path field to the template of any page you want to set a custom path for. Whatever path is entered into this field determines the path and URL of the page ($page->path and $page->url). Page numbers and URL segments are supported if these are enabled for the template, and previous custom paths are managed by PagePathHistory if that module is installed.
      The custom_path field appears on the Settings tab in Page Edit by default but there is an option in the module configuration to disable this if you want to position the field among the other template fields.
      If the custom_path field is populated for a page it should be a path that is relative to the site root and that starts with a forward slash. The module prevents the same custom path being set for more than one page.
      The custom_path value takes precedence over any ProcessWire path. You can even override the Home page by setting a custom path of "/" for a page.
      It is highly recommended to set access controls on the custom_path field so that only privileged roles can edit it: superuser-only is recommended.
      It is up to the user to set and maintain suitable custom paths for any pages where the module is in use. Make sure your custom paths are compatible with ProcessWire's $config and .htaccess settings, and if you are basing the custom path on the names of parent pages you will probably want to have a strategy for updating custom paths if parent pages are renamed or moved.
      Example hooks to Pages::saveReady
      You might want to use a Pages::saveReady hook to automatically set the custom path for some pages. Below are a couple of examples.
      1. In this example the start of the custom path is fixed but the end of the path will update dynamically according to the name of the page:
      $pages->addHookAfter('saveReady', function(HookEvent $event) { $page = $event->arguments(0); if($page->template == 'my_template') { $page->custom_path = "/some-custom/path-segments/$page->name/"; } }); 2. The Custom Paths module adds a new Page::realPath method/property that can be used to get the "real" ProcessWire path to a page that might have a custom path set. In this example the custom path for news items is derived from the real ProcessWire path but a parent named "news-items" is removed:
      $pages->addHookAfter('saveReady', function(HookEvent $event) { $page = $event->arguments(0); if($page->template == 'news_item') { $page->custom_path = str_replace('/news-items/', '/', $page->realPath); } }); Caveats
      The custom paths will be used automatically for links created in CKEditor fields, but if you have the "link abstraction" option enabled for CKEditor fields (Details > Markup/HTML (Content Type) > HTML Options) then you will see notices from MarkupQA warning you that it is unable to resolve the links.
      Installation
      Install the Custom Paths module.
      Uninstallation
      The custom_path field is not automatically deleted when the module is uninstalled. You can delete it manually if the field is no longer needed.
       
      https://github.com/Toutouwai/CustomPaths
      https://modules.processwire.com/modules/custom-paths/
    • By AndZyk
      Hello,
      I am currently building a intranet which will be hosted on the local network of a company. This intranet has many links to files on their fileserver with the protocol file://.
      So for example the links look like this file://domain.tld/filename.ext
      When I try to insert such a link into a URL field, I get the error, that only the protocol http:// is allowed. When I try to insert such a link into a CKEeditor link, it gets stripped out. Is it possible to insert such links into the FieldType URL and CKEditor.
      I know that I could use a FieldType Text or insert a RewriteRule in the .htaccess file, but I am looking for a more elegant solution. 😉
      Regards, Andreas
    • By stanoliver
      Hi! 
      The following code snippet is part of my markup simple navigation and the url_redirect (url field in the backend) just works fine when I put an special custom url into the url_redirect field.
      <?php $nav = $modules->get("MarkupSimpleNavigation"); // topnav $current = $page->rootParent(); // current root page // subnav echo $nav->render(array( 'max_levels' => 2, 'item_tpl' => '<h4><a class="g8-bar-item g8-hover-black" href="{url_redirect|url}">{title}</a></h4><hr class="sidenav">', 'collapsed' => true, ), $page, $current); ?>  In my seperated breadcrumb navigation I use the following code snippet
      <?php foreach($page->parents()->append($page) as $parent) { echo "<li><a href='{$parent->url_redirect|url}'>{$parent->title}</a></li>"; } ?> Now to the problem: In my first code snippet above the
      url_redirect|url 
      works just fine but when I try something similiar in the second code snippet
      $parent->url_redirect|url
      I produce an server error How do I have to change the second code snippet so that it works in the correct way as the first code snippet does?
    • By gregory
      Hi everyone.
      I can't see the PDF uploaded via a field called "pdf".
      I get a url like this: http://localhost:8888/mywebsite/site/assets/files/1129/test.pdf%EF%BB%BF%EF%BB%BF
      Could anyone help me? Thank you.
      <?php foreach($page->case_studies as $item) { if($item->type == 'contenuto') { echo " <a class='btn btn-primary btn-sm' target='blank' href='{$item->pdf->first()->url}' role='button'>Download</a> "; } } ?>  
×
×
  • Create New...