Weighted selector for preferred template


formulate

Not sure how best to go about this and I've tried a few things without success.

I'm trying to construct a $pages->find() with a selector that will give preference to a specific template over all others. Ideally I'm trying to get a good mix of "Template A" and "Template B" pages where normally I would just get all Template B results out of thousands of pages even though my specific search terms are part of Template A pages as well.

Standard find would be something like: $pages->find("template=templateA|templateB,title*=searchTerm,limit=10")

How do I rework the above to prefer templateA results, but not exclusively?


@formulate Refer to my post here, specifically under the search section which contains a link to a commit that should be here.

In short, that minor change opens up the possibility to do multiple $pages->find and stack them one after another, while maintaining pagination.  This means you can use multiple finds and effectively build search results that are ranked.  I used this technique very successfully for the site I showcased and if I'm understanding your post, I think that's what you're ultimately looking to do.


Example:

$finds = [];

$finds[0] = $pages->find("title*=foo", ['findIDs' => 1]);
$finds[1] = $pages->find("title*=bar,id!=".implode("|", $finds[0]), ['findIDs' => 1]);
$finds[2] = $pages->find("title*=baz,id!=".implode("|", array_merge($finds[0],$finds[1])), ['findIDs' => 1]);
// etc...

// final query that gets ids from 0, then 1, then 2
$final = $pages->find("id.sort=".implode("|", array_merge($finds[0],$finds[1],$finds[2])));

 

Edited by Jonathan Lahijani
edit: used wrong variable.

Nice, thx for sharing! Just played around a little with your code example... This version uses the array syntax:

$find0 = $pages->findIDs("title*=home");
$find1 = $pages->findIDs([
	["title", "*=", "admin"],
	["id", "!=", $find0],
]);
$find2 = $pages->findIDs([
	["title", "*=", "modules"],
	["id", "!=", array_merge($find0,$find1)],
]);
$final = $pages->find(["id.sort" => array_merge($find0, $find1, $find2)]);

Thanks for the suggestions.

I don't think I explained myself well enough. I'm really not concerned with pagination at this point. What I'm ultimately trying to achieve is a blend of ranked results from two different templates, but not in the usual "most relevant" sorting method PW uses by default. I still want "Most Relevant" but I want more results to be pulled from Template A than I do Template B. A real simplistic explanation would be something like: for every 2 Template A results, get a Template B result. Not quite literally as 2:1, I still want a variable blend based on relevancy, I just want more results of Template A showing up than B.

Some context: I have a website that is both information pages and eCommerce product pages. I want search results to always contain eCommerce products as well as information pages, based on relevancy. My problem is that there are thousands of information pages so when I search for something, those information pages dominate the results and I don't end up seeing an eCommerce product until page 2 or 3 of the results.


I think one way of doing what you're looking for would be to do two finds, one for products and one for other pages, and then merge them. You need to override the 'organic' results somewhat, so you'll have to impose some kind of artificiality; otherwise you'll just get what you're getting now.

Say you want 10 results, get 6 products and 4 info pages (or 7 & 3 or whatever) and then rank that combined 10 by relevancy. 

Another way would be to always show, say, 3 products at the top, a bit like Google ads at the top of SERPs, except that you wouldn't need to identify them as ads. Yet another way, on wide enough screens, might be to have a column for each.


DaveP, what you're suggesting is in fact what I'm doing now, or at least close to it. However, I'm having to randomize the combined 10 as I can't figure out how to rank by relevancy between the two arrays. At least I can't see any info regarding relevancy that's returned by PW (maybe I missed it in the arrays). I can live with this solution, so long as I can figure out relevancy ranking with the merged results. It's not ideal and seems like a hack workaround, but it would suffice in this instance.

Just seems weird to me that there's no easy way to tell PW that Template A results are more relevant than Template B.


@Robin S I tried your module and all it seems to be doing is the same as $pages->add(). The results from my second selector are just being appended on to the results of the first, not actually merging the two together. I must be doing something wrong? Here's my code:

$selectors = [
	"productSKU|productModel|productShortName|brand|title|summary|content~=$q,template=product,limit=8",
	"title|summary|content%=$q,template=video|btt|article|item|case-study|testimonial|diagram,limit=12"
];
$options = [
	'limit' => 20
];
$results = $pages->findMerge($selectors, $options);

 


4 minutes ago, formulate said:

The results from my second selector are just being appended on to the results of the first, not actually merging the two together.

Yes, that's exactly what it does. It works through the supplied selectors in order and appends the results of each to the last (with the provisos covered in the readme) but the returned PageArray is efficiently paginated which is important for performance if you are dealing with large numbers of results. In your case you are only retrieving 20 results which you're not paginating so you don't need this module for that scenario.


@formulate, it sounds like you want to have some prioritised subset of the results interleaved with the rest of the results, so they are distributed rather than grouped near the end where they would otherwise appear. There isn't any "sort" value you can use in a selector to do this because sort can only work by ordering results based on some specific column in the database, and no such column exists for your case.

So like @DaveP said, you need to get all the results and then apply the sort order yourself. But when you have large numbers of results you want to avoid loading all the results in a PageArray. It's more efficient to work with just the IDs and then get paginated results from those IDs, and to do that you can use the id.sort feature that @Jonathan Lahijani mentioned.

Here's a demo of how you might do this. This is simplified to just use templates to distinguish between the priority results and the other results but you will be able to adapt it to your case.

// Do the custom sorting using only page IDs for efficiency
$priority_ids = $pages->findIDs("template=colour");
$other_ids = $pages->findIDs("template=country");
$merged_ids = [];
// Loop over the $other_ids (this should be the larger of the two arrays)
foreach($other_ids as $key => $id) {
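	// Use modulo operator to insert one priority ID for every two other IDs,
	// but only while priority IDs remain (array_shift() returns null on an empty array)
	if($key % 2 === 0 && count($priority_ids)) {
		$merged_ids[] = array_shift($priority_ids);
	}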
	$merged_ids[] = $id;
}
// If there happen to be any priority IDs left over when the loop is finished, add them at the end
if(count($priority_ids)) $merged_ids = array_merge($merged_ids, $priority_ids);

// Now use id.sort to get pages matching the IDs in paginations of 20
$results = $pages->find([
	'id.sort' => $merged_ids,
	'limit' => 20
]);
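
The fixed one-in-two pattern in the loop above could be generalised so that the blend ratio becomes a parameter. This is only a sketch: `interleave_ids()` and its `$every` argument are hypothetical names, not part of ProcessWire, and the function works on plain arrays of IDs such as those returned by findIDs().

```php
<?php
/**
 * Interleave two ID lists: one priority ID before every $every other IDs.
 * Hypothetical helper for illustration, not a ProcessWire API.
 */
function interleave_ids(array $priority_ids, array $other_ids, int $every = 2): array {
	$every = max(1, $every); // guard against modulo by zero
	$merged = [];
	foreach(array_values($other_ids) as $key => $id) {
		// Only insert while priority IDs remain, so no null entries are added
		if($key % $every === 0 && count($priority_ids)) {
			$merged[] = array_shift($priority_ids);
		}
		$merged[] = $id;
	}
	// Append any priority IDs left over once the other IDs run out
	return array_merge($merged, $priority_ids);
}

// Plain integers standing in for page IDs
$merged_ids = interleave_ids([101, 102], [1, 2, 3, 4, 5], 2);
```

With `$every = 3` the same call would place one priority ID before every three other IDs, which may be closer to a "variable blend" than a hard-coded 2:1.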


@Robin S Ok, I played around with this and can 100% determine that in your code, the following line is causing the issue:

'id.sort' => $merged_ids,

The $merged_ids array is in the correct order before finding all the pages using the IDs. Once that $pages->find executes with the id.sort, it's messing up the order again based on PW/MySQL's own determination of relevancy for sorting. I guess I could loop through the merged IDs array and grab each one with an individual $pages->get, but that seems ultra wasteful.


5 hours ago, formulate said:

Once that $pages->find executes with the id.sort, it's messing up the order again based on PW/MySQL's own determination of relevancy for sorting.

If you can create a simplified demonstration case you could open a core GitHub issue because id.sort is supposed to cause results to be in the order of the supplied IDs.

For now you could follow the approach used in my FindMerge module, where array_slice() is used to get a slice of the IDs according to the current page number: https://github.com/Toutouwai/FindMerge/blob/cbc6f43138508a52ae095c7041dbb969cc7d7bf7/FindMerge.module#L86-L99
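
As a rough illustration of that array_slice() idea (not the module's actual code), the ordered ID list can be cut down to the current page's portion before any pages are loaded. `slice_ids_for_page()` is a hypothetical helper name, and the page number is assumed to be 1-based, as `$input->pageNum` is in ProcessWire.

```php
<?php
/**
 * Return the portion of an ordered ID list that belongs to one results page.
 * Hypothetical helper for illustration; $page_num is 1-based.
 */
function slice_ids_for_page(array $ids, int $page_num, int $limit): array {
	$page_num = max(1, $page_num); // treat anything below 1 as the first page
	$start = ($page_num - 1) * $limit;
	// array_slice() keeps the original order and returns [] past the end
	return array_slice($ids, $start, $limit);
}

// Plain integers standing in for the merged page IDs
$merged_ids = range(1, 45);
$page_ids = slice_ids_for_page($merged_ids, 3, 20);
```

Because the slice is already in the desired order and contains at most one page of IDs, only those pages need to be loaded for the current request.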

