Jump to content

RockFinder3 - Combine the power of ProcessWire selectors and SQL


bernhard

Recommended Posts

Hi! I was trying RF3 with PW 3.0.159 and I got the error that it is not compatible. I fixed it changing the <> on the RockFinder3.module.php

Lines 658 and 664

Also, the I tried a join with a single page reference field and only get the id field instead of "title", I used the fragment as base for the query:

Thank you!

$owners = $rockfinder
  ->find("template=person")
  ->addColumns(['title', 'age'])
  ->setName('owner'); // set name of target column
$rockfinder
  ->find("template=cat")
  ->addColumns(['title', 'owner'])
  ->join($owners)
  ->dump();

rockfinder3-error.JPG

rockfinder3-error-1.JPG

rockfinder3-ok.JPG

  • Thanks 1
Link to comment
Share on other sites

Hi @Crowdland Technology Thx for the report - the wrong version compare is fixed in v1.0.7 ? 

1 hour ago, Crowdland Technology said:

Also, the I tried a join with a single page reference field and only get the id field instead of "title", I used the fragment as base for the query:

I'd need more info on that - what is the exact query you used? What is your setup? What is the output (dump+sql)?

Link to comment
Share on other sites

Don't know what I'm doing wrong but cannot join two RF3 statements when both have the field 'title'.

I based it on the example of:

<?php
$owners = $rockfinder
  ->find("template=person")
  ->addColumns(['title', 'age'])
  ->setName('owner'); // set name of target column
$rockfinder
  ->find("template=cat")
  ->addColumns(['title', 'owner'])
  ->join($owners)
  ->dump();

My case is field 'booking_status' is a page reference field in template 'booking'.

<?php
$rockfinder = modules("RockFinder3");
$booking_status = $rockfinder
    ->find("parent.id=1154, include=all")
    ->addColumns(['title'])
    ->setName('booking_status');

$rockfinder
    ->find("template=booking")
    ->addColumns(['title'=>'booking', 'booking_status'])
    ->join($booking_status)
    ->dump();

Each one individually returns the correct data. However when I try to use the join, it errors with "SQLSTATE[42S02]: Base table or view not found: 1146 Table 'xxxxx_xxxx.field_title_5ee33d16934fa' doesn't exist". Both templates have a 'title' field.

The SQL dump is:

"SELECT
  `pages`.`id` AS `id`,
  `_field_title_5ee33bf3cb7d8_5ee33bf3cb854`.`data` AS `booking`,
  GROUP_CONCAT(DISTINCT `_field_booking_status_5ee33bf3cb92f`.`data` ORDER BY `_field_booking_status_5ee33bf3cb92f`.`sort` SEPARATOR ',') AS `booking_status`,
  GROUP_CONCAT(DISTINCT `join_booking_status_5ee33bf3cbbbe`.`title`) AS `booking_status:title`,
  GROUP_CONCAT(DISTINCT `join_booking_status_5ee33bf3cbbbe`.`booking`) AS `booking_status:booking`,
  GROUP_CONCAT(DISTINCT `join_booking_status_5ee33bf3cbbbe`.`booking_status`) AS `booking_status:booking_status`  
FROM `pages`  
LEFT JOIN `field_title_5ee33bf3cb7d8` AS `_field_title_5ee33bf3cb7d8_5ee33bf3cb854` ON `_field_title_5ee33bf3cb7d8_5ee33bf3cb854`.`pages_id` = `pages`.`id` 
LEFT JOIN `field_booking_status` AS `_field_booking_status_5ee33bf3cb92f` ON `_field_booking_status_5ee33bf3cb92f`.`pages_id` = `pages`.`id` 
LEFT JOIN (
  SELECT
    `pages`.`id` AS `id`,
    `_field_title_5ee33bf3cb5c3`.`data` AS `title`,
    `_field_title_5ee33bf3cb7d8_5ee33bf3cb854`.`data` AS `booking`,
    GROUP_CONCAT(DISTINCT `_field_booking_status_5ee33bf3cb92f`.`data` ORDER BY `_field_booking_status_5ee33bf3cb92f`.`sort` SEPARATOR ',') AS `booking_status`  
  FROM `pages`  
  LEFT JOIN `field_title` AS `_field_title_5ee33bf3cb5c3` ON `_field_title_5ee33bf3cb5c3`.`pages_id` = `pages`.`id` 
  LEFT JOIN `field_title_5ee33bf3cb7d8` AS `_field_title_5ee33bf3cb7d8_5ee33bf3cb854` ON `_field_title_5ee33bf3cb7d8_5ee33bf3cb854`.`pages_id` = `pages`.`id` 
  LEFT JOIN `field_booking_status` AS `_field_booking_status_5ee33bf3cb92f` ON `_field_booking_status_5ee33bf3cb92f`.`pages_id` = `pages`.`id` 
  WHERE (pages.templates_id=53) 
  AND (pages.status<1024)
  GROUP BY pages.id 
) AS `join_booking_status_5ee33bf3cbbbe` ON `join_booking_status_5ee33bf3cbbbe`.`id` = `_field_booking_status_5ee33bf3cb92f`.`data` 
WHERE (pages.templates_id=53) 
AND (pages.status<1024)
GROUP BY pages.id " (1933)
in .../RockFinder3/RockFinder3.module.php:504

I can add any field in the "template=booking" statement EXCEPT 'title'.

Using 'addRelationship' is even worse. It errors with PHP out-of-memory.

Using RF3 v1.0.7 and PW 3.0.158

Help appreciated.

 

 

 

Link to comment
Share on other sites

I've just noticed an issue thats cropped up before in other modules: Processwire allows mixed case field names, however they are always converted to lowercase in mySQL.

If you use a mixed case field name eg in addColumns(['mixedCase']) you will get an SQL error, but if you do addColumns('mixedcase') you get no error, but a column of data with Field Not Found as contents.

Changing this line in Rockfinder3.module.php seems to fix it:

        // add this column to columns array
        $colname = (string) strtolower($column);

I'm about to see if I find any other instances.

  • Thanks 1
Link to comment
Share on other sites

@bernhard Grateful for the offer to help, however on a tight schedule and managed to get the solution to my query using a normal PW selector with the very helpful and hard to find gem "repeater_field.owner.field_name'. It enabled me to get the repeater field, it's originating page and the individual repeater item data.

The selector returns a WireArray of repeater fields which are then manipulated into a flat array for RockTabulator.

From what I could see in the SQL statement from RF3, the field table aliases are getting mixed up when both the first and main RF3 queries have the same field name, eg 'title' - some have the unique id twice and then the LEFT JOIN 'title' gets the wrong one which then fails.

 

  • Like 1
Link to comment
Share on other sites

  • 3 weeks later...

v1.0.8 improves the readme (had no section about getRows() ? ) and adds a nice shortcut $finder->groupBy():

https://github.com/baumrock/rockfinder3#getting-data

 

Getting data

When using a regular $pages->find() you get a PageArray as result. When working with RockFinder we don't want to get the PageArray to be more efficient. We usually want plain PHP arrays that we can then use in our PHP code or that we can send to other libraries as data source (for example as rows for a table library).

getRows()

This returns an array of the result having the id as key for every array item:

$finder = $rockfinder
  ->find("template=cat")
  ->addColumns(['title', 'owner']);
$rows = $finder->getRows();
db($rows);

img

Having the id as item key can be very handy and efficient to get one single array item via its id, eg db($rows[1071]):

img

getRowArray()

Sometimes having custom ids as array item keys is a drawback, though. For example tabulator needs a plain PHP array with auto-increment keys. In such cases you can use getRowArray():

img

https://github.com/baumrock/rockfinder3#predefined-methods

 

Predefined Methods

At the moment there is one shortcut using the string modification technique for grouping a result by one column:

$finder = $rockfinder
  ->find("template=cat")
  ->addColumns(['title', 'owner']);
$cats_by_owner = $finder->groupBy('owner', [
    'GROUP_CONCAT(title) as title',
]);
db($cats_by_owner);

img

Another example could be getting averages:

$finder = $rockfinder
  ->find("template=cat")
  ->addColumns(['title', 'owner', 'weight']);
$cat_weight_by_owner = $finder->groupBy('owner', [
    'AVG(weight) as weight',
]);
db($cat_weight_by_owner);

img

Of course you can combine both:

$finder = $rockfinder
  ->find("template=cat")
  ->addColumns(['title', 'owner', 'weight']);
$combined = $finder->groupBy('owner', [
    'GROUP_CONCAT(title) as title',
    'AVG(weight) as weight',
]);
db($combined);

img

  • Like 3
Link to comment
Share on other sites

  • 1 month later...
On 8/7/2020 at 2:59 PM, bernhard said:

Hi @David Karich,

happy to hear that ? Did you do a transition from RF2 to RF3 or are you using it in parallel? Would you mind sharing some details about how/why you are using RF with us? ? 

On the one hand, I was more or less forced to switch to RF3 because RF2 no longer worked with the new PW-Master 3.0.164. Somehow the bindValues in the selectors were not processed correctly anymore, so that no PageIDs and paths were resolved correctly in the SQL statement. For this reason I have migrated completely to RF3. That was not a big problem either. In general, not much has changed from the public functions. In addition, it is also much more comfortable in RF3 to create and manage your own filters. Well done! I also wrote my own table aliases in the SQLs in RF2 quick and dirty, which RF3 now handles very cleanly itself. My application case has not changed. I continue to use RF3 to load masses of fieldset pages within repeater items, which contain lots of fields with design options for the repeater item. As already described in the RF2 thread below:

Quote

For example, I used it to load about 40 to 60 fields per repeater matrix type, which are also further nested in FieldsetPages (for design settings) in one query. This amount of fields caused 700 or more single SQL queries to be executed and up to 140 PW-Page objects to be loaded into RAM. This caused a significant loading delay. By using RockFinder2 the SQL queries could be reduced to 200 on average. About 60 page objects less are loaded in RAM, the CPU is also happy and the loading time for pages without cache was also reduced on average to 700 ms to 1 second. 

In this context, once again many thanks for this module and the further development! ?

  • Thanks 1
Link to comment
Share on other sites

18 minutes ago, David Karich said:

On the one hand, I was more or less forced to switch to RF3 because RF2 no longer worked with the new PW-Master 3.0.164.

Thx! Good to know that. This should be quite an easy fix (something like changing getQuery() to getDebugQuery())...

Link to comment
Share on other sites

By the way, here's a filter I often use to enrich the path for files or images directly with the corresponding pageID, so that you only have to prefix the file path. 

<?php namespace RockFinder3Column;
/**
 * Column type for multi-value fields like options, page reference, etc
 */
class FileWithID extends \RockFinder3\Column {
  public function applyTo($finder) {
    $finder->query->leftjoin("`{$this->table}` AS `{$this->tableAlias}` ON `{$this->tableAlias}`.`pages_id` = `pages`.`id`");
    $finder->query->select("GROUP_CONCAT(CONCAT(`{$this->tableAlias}`.`pages_id`, '/', `{$this->tableAlias}`.`data`) ORDER BY `{$this->tableAlias}`.`sort` SEPARATOR ',') AS `{$this->alias}`");
  }
}

This results in the string for example: "1234/myfile.jpg" and now just prepend with "$config->urls->files" in the output and you have the full path to the file. Maybe it helps somebody and is looking for exactly that. ? 

  • Like 1
Link to comment
Share on other sites

  • 2 months later...

Hi @bernhard,

thanks for the module, I'm trying to get my head around.

I have this :

$matches = $rockfinder->find($selector)->addColumns(['title','image','variante_prodotto.lunghezza'])->getRows(); // Doesn't work

'variante_prodotto' is a ProField of type table and has a subfield 'lunghezza'.
How could I retrieve the value of 'lunghezza' using your module? ?

Link to comment
Share on other sites

Hi @3fingers

I'm sorry, I have never ever used this field. But what you need should be quite simple to do using custom SQL: https://github.com/baumrock/rockfinder3#custom-sql

You could also create a custom column type: https://github.com/baumrock/rockfinder3#custom-column-types and I could add support for that if you send a PR.

Or I could implement that feature if you want to sponsor such an update.

Link to comment
Share on other sites

  • 3 weeks later...

Hi @bernhard,

We've been running up against the limitations of ProcessWire's native method of handling subfields in selectors. Basically, every subfield in a selector results in a separate query that runs to return all IDs that match the subfield, and then those IDs are used in the where of the final sql query (as opposed to joining the tables and then doing a where on the data all in a single query). This fails at scale because the individual subfield queries start returning so many page Ids that memory is exhausted or MySQL even crashes. There are ways to work around this in some cases by using separate selectors instead of subfields, but it's not terribly elegant. My question is, can we use RockFinder to solve this problem?

In this particular use case, we would still want to end up with Pages in the end. I am guessing that using subfields in a RockFinder selector would behave the same way as it does in native PW (I assume RockFinder just gets the final query that PW creates for it--which may have taken multiple queries to construct--and proceeds from there). But if the join function would allow us to limit results based on the values of subfields all in a single query, then that would be golden.

At that point all we would want back from RockFinder would be an array of page ids, which we could pass to Pages::getById() to get the pages back. Perhaps a method call for doing this is worth putting into RockFinder? This would be much more efficient than using the callback since it would be one SQL query to get all the pages.

Another thing which I mentioned back with RockFinder1 (and implemented locally) is the option of getting a WireArray of WireData objects instead of standard objects or a plain array. This is still way faster than Pages but allows you to use runtime selectors on the results.

Anyway, thanks for your work on this and looking forward to your thoughts.

Link to comment
Share on other sites

Hi @thetuningspoon

22 hours ago, thetuningspoon said:

My question is, can we use RockFinder to solve this problem?

I think so, but don't know for sure. It's definitely worth a try ? This is a sample sql that RF3 generates:

https://github.com/baumrock/rockfinder3#dumping-the-sql-of-the-finder

22 hours ago, thetuningspoon said:

In this particular use case, we would still want to end up with Pages in the end.

Why? Do you need to query lots of pages and fields and then return a subset of that?

22 hours ago, thetuningspoon said:

At that point all we would want back from RockFinder would be an array of page ids, which we could pass to Pages::getById() to get the pages back. Perhaps a method call for doing this is worth putting into RockFinder? This would be much more efficient than using the callback since it would be one SQL query to get all the pages.

Yeah, why not. Just try RF3 and add such a method as PR ? https://github.com/baumrock/RockFinder3/blob/40da7c5f087d87bc33d883add5b78cf7c546bacd/RockFinder3.module.php#L356

22 hours ago, thetuningspoon said:

Another thing which I mentioned back with RockFinder1 (and implemented locally) is the option of getting a WireArray of WireData objects instead of standard objects or a plain array. This is still way faster than Pages but allows you to use runtime selectors on the results.

Sounds like another great PR ? 

I don't want do sound like "do that yourself"... I'm happy to implement anything you need. But I don't have the time and energy to do that for free atm.

Hope that helps ? 

  • Thanks 1
Link to comment
Share on other sites

  • 4 weeks later...

@bernhard Really great module!!

I have around 1 mio pages, and it is pretty fast as long as I don't do comparisons. With more complex finders it is really slow on my end. I need to do a radius search, comparing lat/long fields to min max:

<? $finder = $rockfinder
  ->find("template=physician, physician_profile_geo_lat>=$roundMinLat, physician_profile_geo_lat<=$roundMaxLat, physician_profile_geo_long>=$roundMaxLon, physician_profile_geo_long<=$roundMinLon, limit=12")
  ->addColumns(['title', 'created']);
$rows = $finder->getRows();?>

This is really realy slow - any tips on optimizing this? I switched to RockFinder, hoping that it would increase speed, as I reached the limits of usability with the usual $pages->find with this, too.

If anyone has ideas or tips, I'd appreciate it! 

Link to comment
Share on other sites

Hi @Erik Richter

First, I'd try to find out where the slowdown comes from: ProcessWire or RockFinder? Maybe you can dump the created SQL statement and test it manually. https://github.com/baumrock/RockFinder3#dumping-data

It's just a guess, but maybe it gets faster if you do a really simple query on the PW selector part and then refine the result using custom SQL? https://github.com/baumrock/RockFinder3#custom-sql

Link to comment
Share on other sites

@bernhard Thanks for the quick response!

Would I add the regular field names in the selector?

<? $finder = $rockfinder
  ->find("template=physician, limit=12")
  ->addColumns(['title', 'created']);
$finder->query->where("physician_profile_geo_lat >= $roundMinLat");
...
$finder->query->where("physician_profile_geo_lat >= $roundMinLat");
$rows = $finder->getRows();?>

Thank you!!

Link to comment
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
  • Recently Browsing   0 members

    • No registered users viewing this page.
×
×
  • Create New...