Jump to content

how to transform page selectors into sql queries?


bernhard
 Share

Recommended Posts

hi all!

in my recent project i needed to query the database directly because of some heave datatables calculations that where too slow via $pages->find() operation in a foreach.

so far, so good - everything works fine now. but the query got quite complex, hard to setup and hard to read (maintain). the problem is that you have to join all the fields to get all fields of one template. and this join operations can get quite confusing when you have lots of fields.

any hints how i could simplify this setup?

Link to comment
Share on other sites

$pf = $pages->getPageFinder();
$selector = new Selectors($selector);
$query = $pf->find($selector, ['returnVerbose' => true, 'returnQuery' => true]);

# Show sql
$query->getQuery();

# Modify query
$query->where($sql);

# Run query
$query->execute();

 

  • Like 8
  • Thanks 1
Link to comment
Share on other sites

hmmm

thank you lostkobrakai for the quick response! i think my topic title was misleading...

the problem is, that i have multiple queries in a foreach that i need to transform into a single sql query:

$invoices = [];
foreach($pages->find('template=invoice') as $item) {
    $invoices[] = [
        'num' => $item->id,
        'client' => $item->client->title,
        'net' => $item->getNet(),
        'gross' => $item->getGross(),
    ];
}

// output table of all invoice

num | client       | net    | gross |
---------------------------------------
01  | john doe     |    100 |   120 |
02  | max muster   |    200 |   240 |

getNet and getGross would be themselves a method like this looping all pagetable items:

$sum = 0;
foreach($invoice->items as $item) $sum += $item->net;
return $sum;

in my case i built a database view with all invoiceitems so that i can select everything easily and build sums quickly via sql sum(net) and not a slow foreach() $sum+=...

the problem is just that is was really not easy to do all the joins and find the right fields and join them etc. maybe your code example could have saved me some time... or maybe there is an even better way? :)

thank you again for your help!

Link to comment
Share on other sites

There probably isn't. The pagefinder and Pages::getById and Pages::getFieldValue are the places where processwire does query the db for page data, with the first probably being the most elaborate piece. 

Converting your foreach into a single query is probably a bit difficult (selection and retrieval), but maybe 2 is enough for you. Use $pages->findIDs() and then your custom query with "WHERE id in (1000, 1001, …)"

  • Like 1
Link to comment
Share on other sites

19 minutes ago, LostKobrakai said:

Converting your foreach into a single query is probably a bit difficult

yes, that was the reason for my question :) i already have a query and it works well. it creates a db-view of all my data and then i can run an efficient query on this view easily without joining everything again manually.

i think your pagefinder example code will save me some time next time together with tracy console panel :)

  • Like 1
Link to comment
Share on other sites

  • 10 months later...
On 5/18/2017 at 8:01 AM, bernhard said:

i think your pagefinder example code will save me some time next time together with tracy console panel

I know you know about this now @bernhard but others may not and since I just stumbled across this post, I thought I post this which shows running a $pages->find in the console panel and then viewing the resulting SQL Query in the "Selector Queries" section of the Debug Mode panel (from the ajax bar). Hope that helps others who come across this thread.

image.png.30aceea0fb765a1e13dfc4d1173e79cb.png

  • Like 6
  • Thanks 1
Link to comment
Share on other sites

  • 4 weeks later...
On 5/18/2017 at 10:11 AM, LostKobrakai said:

$pf = $pages->getPageFinder();
$selector = new Selectors($selector);
$query = $pf->find($selector, ['returnVerbose' => true, 'returnQuery' => true]);

# Show sql
$query->getQuery();

# Modify query
$query->where($sql);

# Run query
$query->execute();

 

@LostKobrakai I've been playing around with your example here and it's really neat. I've been looking for a way to add or modify a selector with custom SQL. It looks like your onto something here. However, I can't figure out what to do after running the query. How to I convert the results into a PageArray?

Also, I noticed in studying the core a bit that this technique bypasses some of what would normally occur in a $pages->find(), including (I think) setting up the pagination and the loadOptions.

In my particular use case I have a selector that does exactly what I want, but need to add a custom ORDER_BY for a more complex sort than PW allows.

Link to comment
Share on other sites

@bernhard This looks very cool, but I am not totally clear on what it does and how it does it. Can you give me a semi-technical explanation of how your module works (inputs and outputs) and where it plugs into the core?

What exactly does a call to RockFinder return? Is there a way to get page objects from the results if I don't need the scalability features of the module and just want a normal page array in the end? Can I still use pagination with the results?

Also, as cool as your module looks (and it looks very cool), if what I'm wanting to do can be achieved easily with just the core, I'd prefer to keep it simple.

Link to comment
Share on other sites

hi @thetuningspoon, my module is a companion module for RockGrid. Those grids always need an array of objects as data source. Using $pages->find() and then foreach() to create such an array is terribly inefficient when dealing with several thousands of pages. That's why I built RockFinder -> it creates an efficient SQL statement for you and it is almost as easy as using regular PW selectors.

The benefit of using SQL is that it is incredibly efficient. The downside is that some pw-related tasks become a headache (like checking for published state, sort order, multilanguage field values, to name just a few). RockFinder takes care of all that.

It returns an array of objects that you can easily pass to a RockGrid or do whatever you want with it. See this example:

screencapture-360-test-admin-setup-rockfindertester-2018-05-07-16_12_27.thumb.png.d424c6d1d196bd5e645946b078b42bc2.png

The first array item is this (first row in the screenshot):

dump.png.7ccb2c1c1be57d2bf033e25a0d59eaea.png

Reading your question again RockFinder might not be the best fit for you. It does NOT support any kind of pagination (as this is done by RockGrid) and it does not return a pagearray. Though there is the option of using closures (see here) for using the pw api to return values.

  • Like 2
Link to comment
Share on other sites

21 minutes ago, thetuningspoon said:

Is it creating a page object for each result?

No, as I said it will always return an array of objects. Or the plain sql query that you can use for later modifications. The reason why I built RockFinder was that I didn't want to load all page objects into memory. Therefore it does not return page objects...

When using closures it actually loads all pages into memory, but it still returns a plain array of data. I don't see any reason to return a pagearray, that's the job of $pages->find() and all the other api methods...

rockfinder.png.2935a2760332b0c30b0565113208e25d.png

ps: you could easily loop the returned array and create pages as needed. all entries have the page id as you can see in the screenshot above.

Link to comment
Share on other sites

Quote

No, as I said it will always return an array of objects.

Yes, I understand. I meant how does it work under the hood. I guess path is a dynamic page property, so it requires constructing the page object to get it?

 

Quote

ps: you could easily loop the returned array and create pages as needed. all entries have the page id as you can see in the screenshot above.

You mean with a $pages->get()? Wouldn't that mean going back to the database again to build each page?

 

Anyway, thank you for explaining further. At this point I am wondering if you can provide any insight to my original question, which was how I can modify the sql of a regular $pages->find() and then return the results as a PageArray.

Link to comment
Share on other sites

This seems to do it:

		$pf = $this->pages->getPageFinder();
		$selector = new Selectors($selector);
		$query = $pf->find($selector, ['returnVerbose' => true, 'returnQuery' => true]);

		$statement = $query->execute();
		$statement->execute();

		$ids = array();
		while($row = $statement->fetch()) $ids[] = $row[0];
		$myPages = $this->pages->getById($ids);

 

Is this how PW constructs the PageArray during a regular $pages->find()? So even if you were autojoining all your fields, it is still doing one query to find the matching pages and then another separate query for each page to load the desired fields?

Link to comment
Share on other sites

No, this is not the way pw does it, because pw needs more options in the find() method, handling shortcut calls from findMany(), findIDs() etc.; But your method is perfectly fine :)

You could try to modify the query and add restrictions to the query array:

finder.thumb.png.ecc0c28db262498edccd285a93e988aa.png

you could do something like this (you have to try&error, not sure how/if that works):

$query->orderby('yourorderby statement');

 

Link to comment
Share on other sites

What I meant was whether PW was using the same basic flow: Converting a selector to an SQL select statement, getting the IDs of the matching pages, and then calling getById() to load the actual pages into a PageArray.

I spent this morning doing a deep dive into the core and have confirmed that this is how it's working.

I was also able to simplify the example by @LostKobrakai to the following:

$pf = $this->pages->getPageFinder();
$query = $pf->find($selector, ['returnQuery' => true]);

# Show sql
//$query->getQuery();

# Modify query
//$query->where($sql);

$statement = $query->execute();
$statement->execute();

# Load the pages
$ids = array();
while($row = $statement->fetch()) $ids[] = $row[0];
$myPages = $this->pages->getById($ids);

 

I haven't solved the pagination side of things yet. Unfortunately PagesLoader::find() is doing quite a bit of work that we're not able to take advantage of due to the fact that we have to bypass it completely and go straight to PagesLoader::find() in order to get the DatabaseQuerySelect() object. I'm not sure if this problem can be solved without modifying the core or duplicating a lot of its code.

For future reference, this is the basic flow of a Pages::find() call (sans various options and some intermediary methods):

  1. Pages::find()
    1. Does nothing on its own. Delegates to PagesLoader::find()
  2. PagesLoader::find()
    1. Checks for page in runtime cache, returns cached page if it's in memory
    2. Creates a selector object from your selector string (PageFinder::find() can also do this, as I discovered)
  3. PageFinder::find()
    1. Turns selector object/string into a ProcessWire DatabaseQuerySelect object (via PageFinder::getQuery())
    2. Turns DatabaseQuerySelect into an SQL select statement and queries the database
    3. Returns a multidimensional array with the ID, parent ID, and template ID of each matching page (OR a ProcessWire DatabaseQuerySelect object if the $returnQuery option is true)
  4. PagesLoader::getById()
    1. Takes an array of IDs
    2. Creates a page object for each ID and populates it with fields from the database (an additional database query). This is where any autojoin fields are pulled from the database.
  5. PagesLoader::find() 
    1. Sorts pages by  template (?)
    2. Sets up pagination
    3. Returns final PageArray
  • Like 4
Link to comment
Share on other sites

14 hours ago, thetuningspoon said:
  • Pages::find()
    1. Does nothing on its own. Delegates to PagesLoader::find()
  • PagesLoader::find()
    1. Checks for page in runtime cache, returns cached page if it's in memory
    2. Creates a selector object from your selector string (PageFinder::find() can also do this, as I discovered)
  • PageFinder::find()
    1. Turns selector object/string into a ProcessWire DatabaseQuerySelect object (via PageFinder::getQuery())
    2. Turns DatabaseQuerySelect into an SQL select statement and queries the database
    3. Returns a multidimensional array with the ID, parent ID, and template ID of each matching page (OR a ProcessWire DatabaseQuerySelect object if the $returnQuery option is true)
  • PagesLoader::getById()
    1. Takes an array of IDs
    2. Creates a page object for each ID and populates it with fields from the database (an additional database query). This is where any autojoin fields are pulled from the database.
  • PagesLoader::find() 
    1. Sorts pages by  template (?)
    2. Sets up pagination
    3. Returns final PageArray

To append to that:

Autoloaded fields are retrieved by getById() as you said, but non-autoloaded fields are retrieved on first access of the fields property from the page ($page->my_field) as a separate query. This is an important fact to consider when needing performance, as that default behaviour is quite prone to cause n+1 queries.

  • Like 2
Link to comment
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
 Share

  • Recently Browsing   0 members

    • No registered users viewing this page.
×
×
  • Create New...