Sign in to follow this  
mrjasongorman

Resolve a url to a page

Recommended Posts

I've been working with different CMS's for quite a few years now but there was always one thing that bugged me, I never knew how the CMS takes a url and resolves a page ID from it. I knew it was to do with the "slug" but what i couldn't figure out is when it came to sub pages, as the slug only refers to the page itself not the parent pages in the url e.g /parent-page/sub-page.

The main two CMS's i've worked with are Wordpress and ProcessWire, ProcessWire always has the upper hand when it comes to speed, even a large PW site is tens of milliseconds faster than a fresh Wordpress install.

With the resolution of urls to pages being (probably) the most used query in a CMS i thought i'd investigate the two different approaches.

Both ProcessWire and Wordpress split the urls by forward slash to extract the slugs /about/people/ => ['about', 'people'], however ProcessWire takes a completely different approach when it comes to resolving a page ID from these slugs. ProcessWire uses this query: 

SELECT pages.id,pages.parent_id,pages.templates_id
FROM `pages`
JOIN pages AS parent2 ON (pages.parent_id=parent2.id AND (parent2.name='about'))
JOIN pages AS rootparent ON (parent2.parent_id=rootparent.id AND rootparent.id=1)
WHERE pages.name='people'
AND (pages.status<9999999)
GROUP BY pages.id
LIMIT 0,1;

Resulting in a single item of the page in question including the page's template id. For urls with more parts it looks as though ProcessWire creates more JOINS to essentially walk back up the hierarchy and fully resolve the ID of the correct page. 

 

Wordpress on the other hand takes a different approach:

SELECT ID, post_name, post_parent, post_type 
FROM wp_posts 
WHERE post_name IN ('about','people') 
AND post_type IN ('page','attachment');

More elegant looking however it returns a list of potential items. So requires a loop within PHP to walk the hierarchy tree and determine the correct page ID.

Both queries once cached by MySQL took 19-21ms to return, ProcessWire looks as though it uses this to it's advantage by returning the correct page ID straight from the MySQL cache and doesn't require the extra looping step afterwards.

 

Interesting to see the different approaches to the same problems.

 

  • Like 3

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
Sign in to follow this  

  • Recently Browsing   0 members

    No registered users viewing this page.

  • Similar Content

    • By zota
      Hi,
      After 2 or 3 cliks on my nav bar the browser starts to show
      http://julio.x10host.com/home/juliox15/public_html/logos/
      instead of
      http://julio.x10host.com/logos/
      Where or what should I start the fixing?
      Thanks
    • By louisstephens
      So I have a project where multiple pages are sending POST data to 1 single template page.  All was working well (well, at least with one ajax post), but now I have hit a stumbling block. I figured  the "best" way to handle the request were to use url segments and then use the following in the status page:
      if ($config->ajax && $input->urlSegment1 == 'add-bookmark') { // some code here } However, this doesnt seem to really work (as I assume the the request isnt being posted to /status/ but rather to /status/add-bookmark/). What is the best way to actually handle this?
    • By Noel Boss
      Page Query Boss
      Build complex nested queries containing multiple fields and pages and return an array or JSON. This is useful to fetch data for SPA and PWA.
      You can use the Module to transform a ProcessWire Page or PageArray – even RepeaterMatrixPageArrays – into an array or JSON. Queries can be nested and contain closures as callback functions. Some field-types are transformed automatically, like Pageimages or MapMarker.
      Installation
      Via ProcessWire Backend
      It is recommended to install the Module via the ProcessWire admin "Modules" > "Site" > "Add New" > "Add Module from Directory" using the PageQueryBoss class name.
      Manually
      Download the files from Github or the ProcessWire repository: https://modules.processwire.com/modules/page-query-builder/
      Copy all of the files for this module into /site/modules/PageQueryBoss/ Go to “Modules > Refresh” in your admin, and then click “install” for the this module. Module Methods
      There are two main methods:
      Return query as JSON
      $page->pageQueryJson($query); Return query as Array
      $page->pageQueryArray($query); Building the query
      The query can contain key and value pairs, or only keys. It can be nested and 
      contain closures for dynamic values. To illustrate a short example:
      // simple query: $query = [ 'height', 'floors', ]; $pages->find('template=skyscraper')->pageQueryJson($query); Queries can be nested, contain page names, template names or contain functions and ProcessWire selectors:
      // simple query: $query = [ 'height', 'floors', 'images', // < some fileds contain default sub-queries to return data 'files' => [ // but you can also overrdide these defaults: 'filename' 'ext', 'url', ], // Assuming there are child pages with the architec template, or a // field name with a page relation to architects 'architect' => [ // sub-query 'name', 'email' ], // queries can contain closure functions that return dynamic content 'querytime' => function($parent){ return "Query for $parent->title was built ".time(); } ]; $pages->find('template=skyscraper')->pageQueryJson($query); Keys:
      A single fieldname; height or floors or architects 
      The Module can handle the following fields:
      Strings, Dates, Integer… any default one-dimensional value Page references Pageimages Pagefiles PageArray MapMarker FieldtypeFunctional A template name; skyscraper or city
      Name of a child page (page.child.name=pagename); my-page-name A ProcessWire selector; template=building, floors>=25
      A new name for the returned index passed by a # delimiter:
      // the field skyscraper will be renamed to "building": $query = ["skyscraper`#building`"]  
      Key value pars:
      Any of the keys above (1-5) with an new nested sub-query array:
      $query = [ 'skyscraper' => [ 'height', 'floors' ], 'architect' => [ 'title', 'email' ], ]  
      A named key and a closure function to process and return a query. The closure gets the parent object as argument:
      $query = [ 'architecs' => function($parent) { $architects = $parent->find('template=architect'); return $architects->arrayQuery(['name', 'email']); // or return $architects->explode('name, email'); } ] Real life example:
      $query = [ 'title', 'subtitle', // naming the key invitation 'template=Invitation, limit=1#invitation' => [ 'title', 'subtitle', 'body', ], // returns global speakers and local ones... 'speakers' => function($page){ $speakers = $page->speaker_relation; $speakers = $speakers->prepend(wire('pages')->find('template=Speaker, global=1, sort=-id')); // build a query of the speakers with return $speakers->arrayQuery([ 'title#name', // rename title field to name 'subtitle#ministry', // rename subtitle field to ministry 'links' => [ 'linklabel#label', // rename linklabel field to minlabelistry 'link' ], ]); }, 'Program' => [ // Child Pages with template=Program 'title', 'summary', 'start' => function($parent){ // calculate the startdate from timetables return $parent->children->first->date; }, 'end' => function($parent){ // calculate the endate from timetables return $parent->children->last->date; }, 'Timetable' => [ 'date', // date 'timetable#entry'=> [ 'time#start', // time 'time_until#end', // time 'subtitle#description', // entry title ], ], ], // ProcessWire selector, selecting children > name result "location" 'template=Location, limit=1#location' => [ 'title#city', // summary title field to city 'body', 'country', 'venue', 'summary#address', // rename summary field to address 'link#tickets', // rename ticket link 'map', // Mapmarker field, automatically transformed 'images', 'infos#categories' => [ // repeater matrix! > rename to categories 'title#name', // rename title field to name 'entries' => [ // nested repeater matrix! 'title', 'body' ] ], ], ]; if ($input->urlSegment1 === 'json') { header('Content-type: application/json'); echo $page->pageQueryJson($query); exit(); } Module default settings
      The modules settings are public. They can be directly modified, for example:
      $modules->get('PageQueryBoss')->debug = true; $modules->get('PageQueryBoss')->defaults = []; // reset all defaults Default queries for fields:
      Some field-types or templates come with default selectors, like Pageimages etc. These are the default queries:
      // Access and modify default queries: $modules->get('PageQueryBoss')->defaults['queries'] … public $defaults = [ 'queries' => [ 'Pageimages' => [ 'basename', 'url', 'httpUrl', 'description', 'ext', 'focus', ], 'Pagefiles' => [ 'basename', 'url', 'httpUrl', 'description', 'ext', 'filesize', 'filesizeStr', 'hash', ], 'MapMarker' => [ 'lat', 'lng', 'zoom', 'address', ], 'User' => [ 'name', 'email', ], ], ]; These defaults will only be used if there is no nested sub-query for the respective type. If you query a field with complex data and do not provide a sub-query, it will be transformed accordingly:
      $page->pageQueryArry(['images']); // returns something like this 'images' => [ 'basename', 'url', 'httpUrl', 'description', 'ext', 'focus'=> [ 'top', 'left', 'zoom', 'default', 'str', ] ]; You can always provide your own sub-query, so the defaults will not be used:
      $page->pageQueryArry([ 'images' => [ 'filename', 'description' ], ]); Overriding default queries:
      You can also override the defaults, for example
      $modules->get('PageQueryBoss')->defaults['queries']['Pageimages'] = [ 'basename', 'url', 'description', ]; Index of nested elements
      The index for nested elements can be adjusted. This is also done with defaults. There are 3 possibilities:
      Nested by name (default) Nested by ID Nested by numerical index Named index (default):
      This is the default setting. If you have a field that contains sub-items, the name will be the key in the results:
      // example $pagesByName = [ 'page-1-name' => [ 'title' => "Page one title", 'name' => 'page-1-name', ], 'page-2-name' => [ 'title' => "Page two title", 'name' => 'page-2-name', ] ] ID based index:
      If an object is listed in $defaults['index-id'] the id will be the key in the results. Currently, no items are listed as defaults for id-based index:
      // Set pages to get ID based index: $modules->get('PageQueryBoss')->defaults['index-id']['Page']; // Example return array: $pagesById = [ 123 => [ 'title' => "Page one title", 'name' => 123, ], 124 => [ 'title' => "Page two title", 'name' => 124, ] ] Number based index
      By default, a couple of fields are transformed automatically to contain numbered indexes:
      // objects or template names that should use numerical indexes for children instead of names $defaults['index-n'] => [ 'Pageimage', 'Pagefile', 'RepeaterMatrixPage', ]; // example $images = [ 0 => [ 'filename' => "image1.jpg", ], 1 => [ 'filename' => "image2.jpg", ] ] Tipp: When you remove the key 'Pageimage' from $defaults['index-n'], the index will again be name-based.
       
      Help-fill closures & tipps:
      These are few helpfill closure functions you might want to use or could help as a
      starting point for your own (let me know if you have your own):

      Get an overview of languages:
          $query = ['languages' => function($page){         $ar = [];         $l=0;         foreach (wire('languages') as $language) {             // build the json url with segment 1             $ar[$l]['url']= $page->localHttpUrl($language).wire('input')->urlSegment1;             $ar[$l]['name'] = $language->name == 'default' ? 'en' : $language->name;             $ar[$l]['title'] = $language->getLanguageValue($language, 'title');             $ar[$l]['active'] = $language->id == wire('user')->language->id;             $l++;         }         return $ar;     }]; Get county info from ContinentsAndCountries Module
      Using the [ContinentsAndCountries Module](https://modules.processwire.com/modules/continents-and-countries/) you can extract iso
      code and names for countries:
          $query = ['country' => function($page){         $c = wire('modules')->get('ContinentsAndCountries')->findBy('countries', array('name', 'iso', 'code'),['code' =>$page->country]);         return count($c) ? (array) $c[count($c)-1] : null;     }]; Custom strings from a RepeaterTable for interface
      Using a RepeaterMatrix you can create template string for your frontend. This is
      usefull for buttons, labels etc. The following code uses a repeater with the
      name `strings` has a `key` and a `body` field, the returned array contains the `key` field as,
      you guess, keys and the `body` field as values:
          // build custom translations     $query = ['strings' => function($page){         return array_column($page->get('strings')->each(['key', 'body']), 'body', 'key');     }]; Multilanguage with default language fallback
      Using the following setup you can handle multilanguage and return your default
      language if the requested language does not exist. The url is composed like so:
      `page/path/{language}/{content-type}` for example: `api/icf/zurich/conference/2019/de/json`
       
          // get contenttype and language (or default language if not exists)     $lang = wire('languages')->get($input->urlSegment1);     if(!$lang instanceof Nullpage){         $user->language = $lang;     } else {         $lang = $user->language;     }     // contenttype segment 2 or 1 if language not present     $contenttype = $input->urlSegment2 ? $input->urlSegment2 : $input->urlSegment1;     if ($contenttype === 'json') {         header('Content-type: application/json');         echo $page->pageQueryJson($query);         exit();     } Debug
      The module respects wire('config')->debug. It integrates with TracyDebug. You can override it like so:
      // turns on debug output no mather what: $modules->get('PageQueryBoss')->debug = true; Todos
      Make defaults configurable via Backend. How could that be done in style with the default queries?
      Module in alpha Stage: Subject to change
      This module is in alpha stage … Query behaviour (especially selecting child-templates, renaming, naming etc)  could change
    • By joelplambeck
      Hi Guys,
      I'm trying to do my first migration to the customers existing server (IIS 10) . I ran the site as a subdirectory on my website for test purposes (everything works fine).
      Following the tutorial of Joss, I tryed the site on a local xampp server to make sure, it also works on a root directory. So far so good, everything works.
      Now I moved the files (from the xampp) to the customers server. The root/index page is shown but for every subpage i get 404 Errors...
      Hence I followed the troubleshooting guide for not working URLs:
      On the first sight, the .htaccess file is not recognized, therefore I contacted the host support. They said, it is recognized but not all modules are supported in the processwire .htaccess file. I did the "öalskjfdoal" test in the .htaccess file and didn't get a 500 Error.... BUT the rewrite rule from the hosts support, to proof the file is read, DID work... The support claims, they do not provide debugging... so basically the .htaccess file is recognized and working, but not throwing any errors (for whatever reason).
      Working rewrite rule (from support):
      RewriteEngine On RewriteBase / RewriteRule ^test\.asp$ index.html [NC,L] RewriteRule ^test\.html$ konzept.html [NC,L] RewriteRule ^test2\.html$ team.html [NC,L] The support said, a couple modules are not supported in the htaccess file, the supported ones are listed here: http://www.helicontech.com/ape/ (I think mod_rewrite is supported)
      As I do not completely understand what exactly is happening in the htaccess file, I'm stuck. I tried all suggestions I found regarding this topic on the forum, but none of them solved the problem.
       
      .htaccess.txt
    • By louisstephens
      So I have a project coming up soon and one of the goals was to use Google's AMP project for the project's mobile site. I have gone through the tutorials and think I have a good grasp on the matter, but there is still one roadblock I do not really know how to tackle. The site, which uses a responsive grid system, will look great on a mobile and desktop which was is fine by me. However, if a user comes from Google to one of our AMP pages (ie www.example.com/amp) and clicks on a url, they will then be loading our responsive mobile pages and not the amp pages. For the sake of consistency, we really want to "force" users to stay on all the amp pages.
       
      My current thoughts on how to set up this task:
      Allow url segments for all pages using "/amp" Using a simple if statement, load the amp page if it exists <?php if($input->urlSegment1) { // add "&& $input->urlSegment1 == 'amp'" if you've more urlSegments include("partials/amp-page.php"); } else { include("partials/normal.php"); } ?>  
      However, I have hit a roadblock on appending "/amp" to all pages if they came to an amp page via Google, or even if they are on mobile and visit the site. Is this even possible to do, or should we just use the amp pages (if a user comes from google) and allow them to be active on our mobile pages?  We are just trying to give the fastest load times possible, as well as give a consistent look between mobile and desktop versions. As always, I really appreciate the ideas and help.