
PW 3.0.172 – Find faster and more efficiently


ryan

@adrian It's like an index for page URLs/paths. Usually PW has to join every page in a query in order to know its URL. So the PagePaths module provides a more direct and theoretically faster route. But in my experience, it's not really faster until the URLs get long (like in cases where PW would have to do lots of joins to determine the URL otherwise). The other thing it does is that it lets you perform $pages->find() partial text matching operations on the url/path, which you can't do otherwise. The only place where it adds overhead is if you change the name of a parent page, it then has to re-index all the URLs for everything below the parent. It's not installed automatically because most people don't need it, and it doesn't support multi-language URLs. But it's handy to have when it crosses over with your needs. 

I'm good to provide an option for findRaw to return an array of basic objects if one requests it in the $options. But for most I would suggest sticking to the array because it's good for it to be clearly different in syntax from a regular page. That's because it's all raw and unformatted data, so it's not going to be safe to swap between find() and findRaw() in most cases and good to maintain clear differentiation. For instance, when using something from findRaw() for output, you've got to be sure to entity encode anything you output, etc. Plus, one reason for findRaw() is to provide the lowest level path to the raw data, and I think a PHP array is probably the lowest level, least overhead way of doing that. But having an option/alternative for a std object seems fine to me. 
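To illustrate the encoding point, here is a minimal sketch in plain PHP. The `$rows` array is made-up sample data shaped like an id-indexed array of raw rows, and `encodeRow()` is a hypothetical helper, not part of ProcessWire. It only encodes top-level string values and leaves everything else untouched.

```php
<?php
// Made-up sample data standing in for a findRaw() result:
// plain arrays indexed by page ID (an assumption for illustration).
$rows = [
    1234 => ['title' => 'Fish & Chips', 'summary' => '<b>Fresh</b> daily'],
    5678 => ['title' => 'Tea "time"', 'summary' => 'Served at 4 > 3'],
];

// Hypothetical helper: entity-encode every top-level string value in one raw row.
// Non-string values (ints, nested arrays) are returned unchanged.
function encodeRow(array $row): array {
    return array_map(
        fn($value) => is_string($value)
            ? htmlspecialchars($value, ENT_QUOTES, 'UTF-8')
            : $value,
        $row
    );
}

// Encode each row right before it is written into markup.
foreach ($rows as $id => $row) {
    $safe = encodeRow($row);
    echo "<li data-id=\"$id\">{$safe['title']}: {$safe['summary']}</li>\n";
}
```

With regular Page objects and output formatting on, this kind of encoding is typically handled for you, which is part of why the two result types are not interchangeable.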


44 minutes ago, ryan said:

it lets you perform $pages->find() partial text matching operations on the url/path, which you can't do otherwise.

That "partial" matching was the bit I didn't understand. The module description doesn't make that clear, so I never figured out what the limitation was without it.

45 minutes ago, ryan said:

it doesn't support multi-language URLs

I assume this doesn't break any existing features - just means you can't do partial match selectors on ML URLs, rather than breaking them completely?

47 minutes ago, ryan said:

That's because it's all raw and unformatted data, so it's not going to be safe to swap between find() and findRaw()

I see your point - I hadn't considered the raw and unformatted nature of it. It still feels weird to be dealing with arrays in data coming directly from PW methods, but it's not a big deal.


@ryan - sorry to keep hassling, but I am wondering about your thoughts on my idea for a referencesRaw() type method. You gave my suggestion a like, so I hope it's on your roadmap. It will be invaluable to me now that we have support for getting the URL with "raw" finds.

As an aside, would it actually make more sense to add a "references" selector to find() / findRaw(), eg:

$pages->findRaw('references=1234')

 


  • 2 weeks later...

I had a thought about findRaw from a use-case I have (a flag set on pages via $page->meta() on save).

I don't know how feasible it is, but it would be neat if it were possible to join the contents of 'pages_meta' onto findRaw/getRaw queries. Nothing fancy, all it would need to do is return the raw data (just like any fieldtype that gets joined) as an array indexed by meta key.

// Proposed syntax only: 'pages_meta' as a joinable field is the suggestion being made here
$p = $pages->findRaw('template=foobar, parent=1234', ['title', 'pages_meta']);

/* Would return, for example:
[5678] => [
    'title' => 'The Page Title',
    'pages_meta' => [
        'foo' => 'Plain String',
        'bar' => '{"a":"JSON","b":"String"}'
    ]
]
*/

 


  • 3 weeks later...
On 3/8/2021 at 7:47 PM, ryan said:

@adrian @bernhard I've added support for getting 'url' and 'path' from $pages->findRaw() in the latest commit, but it requires the PagePaths module be installed. Now just need multi-language URL support for that module.

ML support in the PagePaths module would be great!


UPDATE 20 Apr. 2021: This issue has now been fixed.

On 2/17/2021 at 6:11 PM, Jan Romero said:

I just noticed there still seems to be an issue with the join/field selector. If used with pagination it appears to add the limit/pagesize to the total number of results, which distorts pagination. One too many pages are shown, the last of them being empty.

For example, I have this selector:


 template=show, datum>=2019-01-01, datum<2020-01-01, has_parent=1234, sort=-datum, limit=50

With the join fields getTotal() gives me 102, without only 52 (one show per week, makes sense). Interestingly, on the second pagination page, getTotal() will say 54 (?!) and the superfluous 3rd page disappears.

I have just encountered the exact same issue here when trying to paginate a PageArray which has autojoined fields -- without autojoin the pagination works as expected, with it getTotal() shows the wrong number.
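The totals quoted above line up with the extra empty page if one assumes the pager derives its page count as ceil(total / limit). That formula is an assumption for illustration, not taken from the ProcessWire source, and `pageCount()` is a hypothetical helper:

```php
<?php
// Assumption: number of pagination pages = ceil(total / limit).
function pageCount(int $total, int $limit): int {
    return (int) ceil($total / $limit);
}

$limit = 50;

// Total reported without join fields (the expected value).
echo pageCount(52, $limit), "\n";   // prints 2

// Total reported with join fields: 52 + 50, i.e. the limit added once.
echo pageCount(102, $limit), "\n";  // prints 3
```

Under that assumption, the inflated total of 102 yields a third page that has no results to show.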

This is with the newest DEV branch version of ProcessWire: ver. 3.0.175

Edited by LMD
Issue fixed

  • 5 months later...
On 2/7/2021 at 11:42 AM, apeisa said:

Great additions - many of these are things we have set up our own solutions for when hitting performance issues at large scale (10 000+ pages with lots of fields), so it is great to see the core supporting these kinds of things. Thank you Ryan!

Quick note on findRaw - it looks like it doesn't support parent? It would be very powerful to be using parent just like any normal field based page relation.

I also need support for joining parent fields!


  • 2 months later...
  • 11 months later...

Hi all,

Is anyone else having issues with findJoin?

If I try to join a checkbox it doesn't work at all, and if I try to join a page reference field like this:

foreach($pages->findJoin('template=date', 'relationships') as $date) {
    d($date);
}

I end up with this which appears to be unusable:

[Screenshot: d($date) output when using findJoin()]


But if I do the old school auto-join (the checkbox in the field settings), then I get the expected:

[Screenshot: d($date) output with autojoin enabled in the field settings]

Can anyone confirm this behavior in the latest dev version? Or do you know what I am doing wrong?

Thanks for any input!

 
