Jump to content

Complex sorting + pagination


bookie
 Share

Recommended Posts

I've been trying to solve this one by myself for a while now but I think my unfamiliarity with the grammar of php/processwire is keeping an obvious solution from my grasp.

I have a collection of items that need to be sorted by variables of descending priority. I can figure that part out. I also have a general idea of how to sort using a single variable and then paginate the results. But I can't figure out how to both create a complex sort function and then properly apply a limit.

Here is my sort, taken mostly from another solution offered on the forums (that I forgot to bookmark)

// Create Sort Hierarchy 
$subjectsort = $pages->find("template=book, catalog={$page->title}, subject!=''")->each(function($sort_subject){
  $sort_subject->srt1 = $sort_subject->subject;
  if($sort_subject->author){
    $sort_subject->srt2 = $sort_subject->cleanauthor;
    $sort_subject->srt3 = $sort_subject->cleantitle;}
  else{
    $sort_subject->srt2 = $sort_subject->cleantitle;
  }
});
$authorsort = $pages->find("template=book, catalog={$page->title}, subject='', author!=''")->each(function($sort_author){
  $sort_author->srt1 = $sort_author->cleanauthor;
  $sort_author->srt2 = $sort_author->cleantitle;
});;
$titlesort = $pages->find("template=book, catalog={$page->title}, subject='', author=''")->each(function($sort_title){
  $sort_title->srt1 = $sort_title->cleantitle;
});;


$subjectsort->add($authorsort)->add($titlesort)->sort('srt1, srt2, srt3'); //concatinates all items then sorts them

If I simply then list out the results of $subjectsort everything is ordered correctly.

But if instead of the final line above I try something like:

$items = $pages->find("template=book, catalog={$page->title}, sort=$srt1, $srt2, $srt3, limit=50");

The resulting array is simply sorted by page title, which is radically different from cleantitle. 

And if I try 

$items = $pages->find("template=book, catalog={$page->title}, sort=srt1, srt2, srt3, limit=50");

I get a fatal error (Unknown Selector operator: '[empty]')

Out of curiosity I tried 

$items = $subjectsort("limit=50");

Which got me a list of 50 completely blank items.

I don't know if it's possible to add a limit to the concatenation of $subjectsort after the fact, and I don't know if it's possible to carry over the prioritization done in the code above to sort via a new find function later. It seems that "limit" is limited in where it operates. I wonder if it would be better to make srt1 etc. permanent fields that are modified in a module hooked to saving a page, but I haven't touched modules yet, and don't want to add to the problem pile just yet.

Any ideas of where I'm going wrong? Thanks a ton for all the help so far.

 

Link to comment
Share on other sites

When you do a $pages->find($selector) operation your selector is translated by PW into an SQL query. It's a bit of a simplification (because there some exceptions for things like "count"), but essentially this means that you can only sort by values that correspond to columns in the database. So for example you can sort by "title" because that corresponds to a column in the database. But you can't use some logic at runtime to conditionally build up a custom value based on various properties of each page and then sort by that because there is no corresponding column for that custom value.

So your options are...

1. If the only way to work out the sort is to iterate over all the pages in a PageArray and you then want to paginate that sorted PageArray, you can do something like this:

// $results is a PageArray that you have applied some custom sorting to at runtime

$total = $results->count();
$limit = 10;
// Convert page number to be zero-based
$page_num = $input->pageNum - 1;
$start = $page_num * $limit;
// Get the slice of results for the current pagination
$results = $results->slice($start, $limit);
$results->setStart($start)->setLimit($limit)->setTotal($total);

foreach($results as $result) {
	// Output result
}
echo $results->renderPager();

But because you have to load all the results into memory in order to sort them, this strategy is not going to scale well with large numbers of results.

 

2. If the sorting doesn't actually depend on dynamic input you can think about using an integer field in the template to store a weighting that you can sort by in a $pages->find() selector. You would use a saveReady hook to populate this sort_weighting field according to other field values in the page. You can use the API to loop over all your book pages to set a sort_weighting value initially.

The sorting in your code didn't quite make sense to me so you'll probably need to adapt this to suit, but the general idea is this:

// In /site/ready.php
$pages->addHookAfter('saveReady', function(HookEvent $event) {
	$page = $event->arguments(0);
	if($page->template == 'book') {
		$sort_weighting = 0;
		if(!$page->subject) $sort_weighting = -1;
		if(!$page->subject && !$page->author) $sort_weighting = -2;
		$page->sort_weighting = $sort_weighting;
	}
});

// In your template code
$results = $pages->find("template=book, sort=sort_weighting, sort=subject, sort=cleanauthor, sort=cleantitle, limit=50");

 

  • Like 3
Link to comment
Share on other sites

15 hours ago, Robin S said:

When you do a $pages->find($selector) operation your selector is translated by PW into an SQL query. It's a bit of a simplification (because there some exceptions for things like "count"), but essentially this means that you can only sort by values that correspond to columns in the database.

I suspected the problem was something like that. 

Re: solution 1, the list I'm rendering consists of 700 pages, so load time is the primary driver of why I'm looking for a pagination solution. It looks like this option will have the same load time issues as the single page setup I've currently got going.

Re: solution 2, if I understand you correctly a single sort_weight variable won't do the trick, because the sort does depend on dynamic input. Different catalog pages will have a different collection of child pages rendered, and their positions can change.

I'll explain what the sorting is intended to do:

The template running this code produces a list of books. Those books always have a title, but they may or may not have an author and/or subject.

When sorting, we want books to be listed by subject, but if there is no subject, then primarily by author, but if there is neither subject nor author, then by title. When subject/author is identical between books we want items to then be properly sorted within that subset by the next highest priority field.

So for example a set of books might be sorted as below:

1. [American Fiction] Frost, Robert. Poems.

2. [American Fiction] Twain, Mark. Huckleberry Fin.

3. Christie, Agatha. And then there were None.

4. Report on the Mississippi River Valley.

If you are wondering why the data is as convoluted as it seems to be (why not give everything a subject?), the answer is the traditions of the field I'm in are centuries older than modern computing and have no concerns over the misery of the poor person trying to make sense out of it.

Previously I had a sort that looked like this:

  foreach($items as $tempsort){
    $tempsort->set('sortval', $tempsort->subject ?: $tempsort->cleanauthor ?: $tempsort->cleantitle); // Runtime field
    }
  $items->sort("sortval, date");

This is imperfect because after something is sorted by subject, the items that share a subject are not subsequently sorted by author since it's only sorting once, by a single assigned value. So I created a tier-based sort so that every book would be sorted by primary, secondary, and tertiary terms. 

What I'm thinking is the most effective way is to make permanent fields for srt1, srt2, and sr3 that are generated on saving the page. I've never used hooks before and am still a little apprehensive about it, but if pages did have those fields baked into the database then would the below work as expected?

$items = $pages->find("template=book, catalog={$page->title}, sort=srt1, srt2, srt3, limit=50");

 

Link to comment
Share on other sites

Perhaps I'm missing something, but couldn't you do something like this in a saveReady hook?

$pages->addHookAfter('saveReady', function(HookEvent $event) {
	$page = $event->arguments(0);
	if($page->template == 'book') {
		$page->my_sort_field = implode(' ', array_filter([
			$page->subject,
			$page->cleanauthor,
			$page->cleantitle,
		]));
	}
});

... and now that your sort field (which could be a hidden text field, or a textarea if there's a lot of data, which I assume is not the case) contains a single value, you can sort results by that. As far as I can tell the only issue here is that you'll basically have the same data stored multiple times ?‍♂️

(Unless I've misunderstood you, and you actually want to keep all books that have a subject before any of those that don't, etc. But in that case a simple sort by multiple field seems enough.)

  • Like 3
  • Thanks 1
Link to comment
Share on other sites

@teppo, Please bear with me here as I have 0 education in php and am figuring it out as I go. I had never heard of implode before but that looks like it could be what I need.

If I understand what the purpose of the code is, you're creating a single string that combines subject, cleanauthor and cleantitle. If there's no subject, the string simply ignores that value and starts at cleanauthor?

Or if

subject = American Fiction

author = [null]

title = Some anonymous book

The resulting string would be "American Fiction some anonymous book" ?

That sounds like it would probably work, actually. There is some redundancy but that's been the story of my life with this project, as the display of information is very particular and the data behind it is very irregular. For example, cleanauthor and cleantitle are both temp variables derived from author and title fields. I would assume that before the implode I will need to define these as I do in the template. Will this cause problems when calling the variables with array_filter()? 

Link to comment
Share on other sites

2 hours ago, bookie said:

If I understand what the purpose of the code is, you're creating a single string that combines subject, cleanauthor and cleantitle. If there's no subject, the string simply ignores that value and starts at cleanauthor?

Or if

subject = American Fiction

author = [null]

title = Some anonymous book

The resulting string would be "American Fiction some anonymous book" ?

Yes, that's the intention.

2 hours ago, bookie said:

For example, cleanauthor and cleantitle are both temp variables derived from author and title fields. I would assume that before the implode I will need to define these as I do in the template. Will this cause problems when calling the variables with array_filter()? 

I actually assumed that these were field values, but if not, then yes — you'll have to define them here as well ?

This is a little off-topic, but if/when you store custom non-field values within a Page object, I would recommend prefixing them with an underscore. It's not a big deal, but can make it a little more obvious that these are not actual field values. Also, ProcessWire will never save a property such as $page->_some_key to the database, so this way it's impossible to accidentally overwrite some *real* (field) value.

  • Thanks 1
Link to comment
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
 Share

  • Recently Browsing   0 members

    • No registered users viewing this page.
×
×
  • Create New...