RockFinder3 by bernhard

Combine the power of ProcessWire selectors and SQL.

img

Combine the power of ProcessWire selectors and SQL


img

Preface

Why this module exists


Initially RockFinder1 was built to feed client side datatables with an array of ProcessWire page data. Loading all pages into memory via a $pages->find() query can quickly get inefficient. Querying the database directly via SQL can quickly get very complex on the other hand.

RockFinder is here to help you in such situations and makes finding (or aggregating) data stored in your ProcessWire installation easy, efficient and fun.

Possible use cases:

  • Find data for any kind of tabular data (tabulator.info, datatables.net, ag-grid.com).
  • Reduce the amount of necessary SQL queries (see here).
  • Find data for a CSV or XML export.
  • Find data for a REST-API.

Differences to previous RockFinder modules


Getting help / Contribute


Example Snippets


TracyDebugger is not necessary for using RockFinder3 but it is recommended. All examples in this readme show dumps of RockFinder instances using the Tracy Console.

Special thanks to Adrian once more for the brilliant TracyDebugger and the quick help for adding dumping support to the Tracy Console! This was tremendously helpful for developing this module and also for writing these docs.

img

Basic Concept

The concept of RockFinder is to get the base query of the $pages->find() call and modify it for our needs so that we get the best of both worlds: Easy PW selectors and powerful and efficient SQL operations.

In PW every find operation is turned into a DatabaseQuerySelect object. This class is great for working with SQL via PHP because you can easily modify the query at any time without complex string concatenation operations:

img

This is the magic behind RockFinder3! It provides an easy to use API to modify that base query and then fires one efficient SQL query and gets an array of stdClass objects as result.

img

Installation

Install the RockFinder3Master module. The master module is an autoload module that adds a new variable $rockfinder to the PW API and also installs the RockFinder3 module that is responsible for all the finding stuff.

img

Usage

In the most basic setup the only thing you need to provide to a RockFinder is a regular PW selector via the find() method:

// either via the API variable
$rockfinder->find("template=foo");

// or via a modules call
$modules->get('RockFinder3')->find("template=foo");

img

Adding columns

You'll most likely don't need only ids, so there is the addColumns() method for adding additional columns:

$rockfinder
  ->find("template=admin, limit=3")
  ->addColumns(['title', 'created']);

img

This makes it possible to easily add any field data of the requested page. If you want to add all fields of a template to the finder there is a shortcut:

$rockfinder
  ->find("template=foo")
  ->addColumnsFromTemplate("foo");

img

Getting data

When using a regular $pages->find() you get a PageArray as result. When working with RockFinder we don't want to get the PageArray to be more efficient. We usually want plain PHP arrays that we can then use in our PHP code or that we can send to other libraries as data source (for example as rows for a table library).

getRows()


This returns an array of the result having the id as key for every array item:

$finder = $rockfinder
  ->find("template=cat")
  ->addColumns(['title', 'owner']);
$rows = $finder->getRows();
db($rows);

img

Having the id as item key can be very handy and efficient to get one single array item via its id, eg db($rows[1071]):

img

getRowArray()


Sometimes having custom ids as array item keys is a drawback, though. For example tabulator needs a plain PHP array with auto-increment keys. In such cases you can use getRowArray():

img

img

Dumping data

For small finders Tracy's dump() feature is enough, but if you have more complex finders or you have thousands of pages this might get really inconvenient. That's why RockFinder3 ships with a custom dump() method that works in the tracy console and turns the result of the finder into a paginated table (using tabulator.info).

For all the dumping methods you can provide two parameters:

  1. The title of the dump*
  2. The settings for the rendered tabulator table

*Note that if you set the title to TRUE the method will not only dump the tabulator but also the current RockFinder3 object (see the next example).

dump() or d()


$rockfinder
  ->find("id>0")
  ->addColumns(['title', 'created'])
  ->dump(true);

img

barDump() or bd()


For situations where you are not working in the console but maybe in a template file or a module the barDump() method might be useful.

img

Dumping the SQL of the finder


To understand what is going on it is important to know the SQL that is executed. You can easily dump the SQL query via the dumpSQL() or barDumpSQL() methods. This even supports chaining:

$rockfinder
  ->find("template=cat")
  ->addColumns(['title'])
  ->dumpSQL()
  ->addColumns(['owner'])
  ->dumpSQL()
  ->dump();

img

img

Renaming columns (column aliases)

Sometimes you have complicated fieldnames like my_great_module_field_foo and you just want to get the values of this field as column foo in your result:

$rockfinder
  ->find("template=person")
  ->addColumns(['title' => 'Name', 'age' => 'Age in years', 'weight' => 'KG'])
  ->dump();

img

img

Custom column types

You can add custom column types easily. Just place them in a folder and tell RockFinder to scan this directory for columnTypes:

// do this on the master module!
$modules->get('RockFinder3Master')->loadColumnTypes('/your/directory/');

See the existing columnTypes as learning examples.

img

Working with options fields

By default RockFinder will query the data column in the DB for each requested field. That's fine for lots of fields (like Text or Textarea fields), but for more complex fields this will often just return an ID value instead of the value that we would like to see (like a file name, an option value, etc):

$rockfinder
  ->find("template=cat")
  ->addColumns(['title', 'sex'])
  ->dump();

img

Option 1: OptionsValue and OptionsTitle columnTypes

In case of the Options Fieldtype we have a title and a value entry for each option. That's why RockFinder ships with two custom columnTypes that query those values directly from the DB (thanks to a PR from David Karich @RockFinder2). You can even get both values in one single query:

$rockfinder
  ->find("template=cat")
  ->addColumns([
    'title',
    'sex' => 'sex_id',
    'OptionsValue:sex' => 'sex_value',
    'OptionsTitle:sex' => 'sex_title',
  ])
  ->dump();

img

Note that the column aliases are necessary here to prevent duplicate columns with the same name!

Option 2: Options relation

Option 1 is very handy but also comes with a drawback: It loads all values and all titles into the returned resultset. In the example above this means we'd have around 50x m, 50x f, 50x Male and 50x female on 100 rows. Multiply that by the number of rows in your resultset and you get a lot of unnecessary data!

Option 2 lets you save options data in the finder's getData()->option property so that you can then work with it at runtime (like via JS in a grid that only renders a subset of the result):

$rockfinder
  ->find("template=cat")
  ->addColumns([
    'title',
    'sex',
  ])
  ->addOptions('sex');

img

$finder->options->sex[2]->value; // f
$finder->options->sex[2]->title; // Female

You can also use the helper functions:

$finder->getOptions('sex');
$finder->getOption('sex', 2);

img

img

Multi-Language

Usually data of a field is stored in the data db column of the field. On a multi-language setup though, the data is stored in the column for the user's current language, eg data123. This makes the queries more complex, because you need to fallback to the default language if the current language's column has no value. RockFinder3 does all that for you behind the scenes and does just return the column value in the users language:

$user->language = $languages->get(1245);
$rockfinder
  ->find("template=cat")
  ->addColumns([
    'title',
    'sex',
  ])
  ->dump();

img

Even setting up new columnTypes is easy! Just use the built in select property of the column and it will return the correct SQL query for you:

class Text extends \RockFinder3\Column {
  public function applyTo($finder) {
    $finder->query->leftjoin("`{$this->table}` AS `{$this->tableAlias}` ON `{$this->tableAlias}`.`pages_id` = `pages`.`id`");
    $finder->query->select("{$this->select} AS `{$this->alias}`");
  }
}

This will use these values behind the scenes (here for the title field):

img

img

Aggregations

The simplest aggregation is the count() method:

$count = $rockfinder->find(...)->count();

Often we need to calculate sums or averages of table data quickly and efficiently. RockFinder3 makes that easy as well:

$avg = $rockfinder->find("template=cat")
    ->addColumn('weight')
    ->getObject("SELECT AVG(weight)");
db($avg);

img

$cats = $rockfinder
  ->find("template=cat")
  ->addColumns(['title', 'weight']);
$cats->dump(); // dump finder data to tracy
$obj = $cats->getObject("SELECT SUM(`weight`) AS `total`, COUNT(`id`) AS `cats`, SUM(`weight`)/COUNT(`id`) AS `avg`");
db($obj); // dump result of aggregation

img

What happens behind the scenes is that RockFinder3 gets the current SQL query of the finder and adds that as FROM (...sql...) AS tmp to the query that you provide for aggregation.

This is the resulting SQL query of the example above:

SELECT SUM(`weight`) AS `total`, COUNT(`id`) AS `cats`, SUM(`weight`)/COUNT(`id`) AS `avg` FROM (
  SELECT
    `pages`.`id` AS `id`,
    `_field_title_605cab16f38ce`.`data` AS `title`,
    `_field_weight_605cab16f3993`.`data` AS `weight`
  FROM `pages`
  LEFT JOIN `field_title` AS `_field_title_605cab16f38ce` ON `_field_title_605cab16f38ce`.`pages_id` = `pages`.`id`
  LEFT JOIN `field_weight` AS `_field_weight_605cab16f3993` ON `_field_weight_605cab16f3993`.`pages_id` = `pages`.`id`
  WHERE (pages.templates_id=44)
  AND (pages.status<1024)
  GROUP BY pages.id
) AS tmp

You can even provide a suffix for your query to do things like GROUP BY etc:

$rf = $rockfinder->find("template=cat|dog");
$rf->addColumns(['created']);
$rf->dump();
db($rf->getObjects(
  "SELECT COUNT(id) AS `count`, DATE_FORMAT(created, '%Y-%m-%d') AS `date`",
  "GROUP BY DATE_FORMAT(created, '%Y-%m-%d')"
));

img

Callbacks

RockFinder3 supports row callbacks that are executed on each row of the result. Usage is simple:

each()


$rockfinder
  ->find("template=cat")
  ->addColumns(['title', 'weight'])
  ->each(function($row) { $row->myTitle = "{$row->title} ({$row->weight} kg)"; })
  ->dump();

img

These callbacks can be a great option, but keep in mind that they can also be very resource intensive! That applies even more when you request page objects from within your callback (meaning there will be no benefit at all in using RockFinder compared to a regular $pages->find() call).

addPath()


A special implementation of the each() method is the addPath() method that will add a path column to your result showing the path of every page. This will not load all pages into memory though, because it uses the $pages->getPath() method internally.

$rockfinder
  ->find("template=cat")
  ->addColumns(['title', 'weight'])
  ->addPath("de")
  ->dump();

img

If you need the path for linking/redirecting from your data to the pages it might be better to build a custom redirect page that works with the page id, so you don't need the overhead of getting all page paths:

<a href='/your/redirect/url/?id=123'>Open Page 123</a>

If you really need to access page objects you can get them via the $finder parameter of the callback:

$finder->each(function($row, $finder) {
  $row->foo = $finder->pages->get($row->id)->foo;
}

map(column, labels)


Sometimes the value stored in the DB is a key that is not as descriptive as a text label. For example it could be that the returned value of a "sex" column is f for female or m for male. One option is to replace this value via JS which could have benefits regarding performance on large datasets, but it is a little more work to make everything work as expected (eg you need to take care of formatting of the cell as well as the sort and filter values).

It might be easier to map the returned value of the finder to an array of labels in such cases:

$sexes = ['f'=>'Female', 'm'=>'Male'];
$rockfinder->find(...)
  ->addColumns(["sex", ...])
  ->map("sex", $sexes);

The column will then show male and female instead of m and f.

img

Joins

What if we had a template cat that holds data of the cat, but also references one single owner. And what if we wanted to get a list of all cats including their owners names and age? The owner would be a single page reference field, so the result of this column would be the page id of the owner:

$rockfinder
  ->find("template=cat")
  ->addColumns(['title', 'owner'])
  ->dump();

img

Joins to the rescue:

$owners = $rockfinder
  ->find("template=person")
  ->addColumns(['title', 'age'])
  ->setName('owner'); // set name of target column
$rockfinder
  ->find("template=cat")
  ->addColumns(['title', 'owner'])
  ->join($owners)
  ->dump();

img

If you don't want to join all columns you can define an array of column names to join. You can also set the removeID option to true if you want to remove the column holding the id of the joined data:

->join($owners, ['columns' => ['title'], 'removeID' => true])

img

Joins work great on single page reference fields. But what if we had multiple pages referenced in one single page reference field?

img

Relations

Let's take a simple example where we have a page reference field on template cat that lets us choose kittens for this cat:

img

This is what happens if we query the field in our finder:

$rockfinder
  ->find("template=cat")
  ->addColumns(['title', 'kittens'])
  ->dump();

img

So, how do we get data of those referenced pages? We might want to list the name of the kitten (the title field). This could be done in a similar way as we did on the options field above. But what if we also wanted to show other field data of that kitten, like the sex and age? It would get really difficult to show all that informations in one single cell of output!

Relations to the rescue:

// setup kittens finder that can later be added as relation
$kittens = $rockfinder
  ->find("template=kitten")
  ->setName("kittens")
  ->addColumns(['title', 'OptionsTitle:sex', 'age']);

// setup main finder that finds cats
$finder = $rockfinder
  ->find("template=cat,limit=1")
  ->setName("cats")
  ->addColumns(['title', 'kittens'])
  ->addRelation($kittens);

// dump objects
db($finder);
db($finder->relations->first());

img

NOTE Look at the result of the kittens finder: It returned three rows as result even though we did not define any limit on the initial setup of that finder! That is because RockFinder will automatically return only the rows of the relation that are listed in the column of the main finder!

You can see what happens in the SQL query:

$finder->relations->first()->dumpSQL();
SELECT
  `pages`.`id` AS `id`,
  `_field_title_5eca947b3da27`.`data` AS `title`
FROM `pages`
LEFT JOIN `field_title` AS `_field_title_5eca947b3da27`
  ON `_field_title_5eca947b3da27`.`pages_id` = `pages`.`id`
WHERE (pages.templates_id=51)
AND (pages.status<1024)
AND pages.id IN (258138,258171,258137) /* here is the limit */
GROUP BY pages.id

If you need to access those kittens 258138,258171,258137 via PHP you can do this:

$relation = $finder->relations->first();
db($relation->getRowsById("258138,258171,258137"));
db($relation->getRowById(258138));

img

There's a lot you can do already simply using the RockFinder API, but I promised something about using SQL...

img

Custom SQL

Option 1: DatabaseQuerySelect


RockFinder3 is heavily based on the DatabaseQuerySelect class of ProcessWire. This is an awesome class for building all kinds of SQL SELECT statements - from simple to very complex ones. You can access this query object at any time via the query property of the finder:

$owners = $rockfinder
  ->find("template=person")
  ->addColumns(['title', 'age', 'weight']);
db($owners->query);

img

This means you have full control over your executed SQL command:

$finder = $rockfinder->find(...)->addColumns(...);
$finder->query->select("foo AS foo");
$finder->query->select("bar AS bar");
$finder->query->where("this = that");

The only thing you need to take care of is to query the correct tables and columns. This might seem a little hard because many times the names are made unique by a temporary suffix. It's very easy to access these values though:

$owners = $rockfinder
  ->find("template=person")
  ->setName('owner')
  ->addColumns(['title', 'age']);
db($owners->columns->get('age'));

img

Example: Group by date

As an example we will create a list of the count of cats and dogs related to their page-creation day. We start with a simple list of all ids of cats and dogs.

Spoiler: There is a shortcut method groupby() described in the section about Predefined methods below ;)

Also note that since 03/2021 there is another option to get aggregated data from a finder result. In my opinion it is a little easier to read and to learn. See section Aggregations above!

$rf = $rockfinder->find("template=cat|dog");
$rf->dumpSQL();
$rf->dump();

img

We can't simply add the created column because this is a timestamp. We need a formatted date, so we add it as custom SQL:

$rf = $rockfinder->find("template=cat|dog");
$rf->query->select("DATE_FORMAT(pages.created, '%Y-%m-%d') as created");
$rf->dumpSQL();
$rf->dump();

img

Great! Now we need to group the result by the date string:

$rf = $rockfinder->find("template=cat|dog");
$rf->query->select("DATE_FORMAT(pages.created, '%Y-%m-%d') as created");
$rf->query->select("COUNT(id) as cnt");
$rf->query->groupby("DATE_FORMAT(pages.created, '%Y-%m-%d')");
$rf->dumpSQL();
$rf->dump();

img

Wait... That's not what we expected, right? That's because we still have the pages.id column in our SELECT and GROUP BY statement and therefore we end up with all the cats and dogs as unique rows. To get rid of that column we make one important change: Instead of adding the SELECT and GROUP BY statement to the query we overwrite them:

$rf = $rockfinder->find("template=cat|dog");
$rf->query->set('select', [
    "DATE_FORMAT(pages.created, '%Y-%m-%d') as created",
    "COUNT(id) as cnt",
]);
$rf->query->set('groupby', [
    "DATE_FORMAT(pages.created, '%Y-%m-%d')",
]);
$rf->dumpSQL();
$rf->dump();

img

Not too complicated, right? You want yearly stats? Easy! Simply change the date format string to %Y:

img

Option 2: SQL String Modification


Another technique is to get the resulting SQL and wrap it around a custom SQL query:

$owners = $rockfinder
  ->find("template=person")
  ->addColumns(['title', 'age', 'weight'])
  ->setName('owner');
$cats = $rockfinder
  ->find("template=cat")
  ->addColumns(['title', 'owner'])
  ->join($owners)
  ->getSQL();

Now we have the SQL statement in the $cats variable. To get the average age of all owners:

$sql = "SELECT AVG(`owner:age`) FROM ($cats) AS tmp";
db($rockfinder->getObject($sql));

img

Get average age of all owners older than 50 years:

$sql = "SELECT
  AVG(`owner:age`) AS `age`,
  `owner:weight` as `weight`
  FROM ($cats) AS tmp
  WHERE `owner:age`>50
";
db($rockfinder->getObject($sql));

img

Your SQL skills are the limit!

Predefined Methods

At the moment there is one shortcut using the string modification technique for grouping a result by one column:

$finder = $rockfinder
  ->find("template=cat")
  ->addColumns(['title', 'owner']);
$cats_by_owner = $finder->groupBy('owner', [
  'GROUP_CONCAT(title) as title',
]);
db($cats_by_owner);

img

Another example could be getting averages:

$finder = $rockfinder
  ->find("template=cat")
  ->addColumns(['title', 'owner', 'weight']);
$cat_weight_by_owner = $finder->groupBy('owner', [
  'AVG(weight) as weight',
]);
db($cat_weight_by_owner);

img

Of course you can combine both:

$finder = $rockfinder
  ->find("template=cat")
  ->addColumns(['title', 'owner', 'weight']);
$combined = $finder->groupBy('owner', [
  'GROUP_CONCAT(title) as title',
  'AVG(weight) as weight',
]);
db($combined);

img

If you need the SQL statement instead of a PHP array you can set that as one of several options:

$sql = $finder->groupby('foo', [...], ['sql'=>true]);

For example you can then JOIN complex queries quite easily:

$foo = $rockfinder->find(...);
$foosql = $foo->getSQL();

$bar = $rockfinder->find(...);
$barsql = $bar->groupby('bar', [...], ['sql'=>true]);

$join = "SELECT
  foo.xx, foo.yy, bar.xx, bar.yy
  FROM ($foosql) AS foo
  LEFT JOIN ($barsql) AS bar ON foo.xx = bar.yy";
db($rockfinder->getObjects($sql));

img

Thank you

...for reading thus far and for using RockFinder3. If you find RockFinder3 helpful consider giving it a star on github or saying thank you. I'm also always happy to get feedback in the PW forum!

Happy finding :)

Install and use modules at your own risk. Always have a site and database backup before installing new modules.

Twitter updates

  • ProcessWire 3.0.185 (dev) core updates, plus new Session Allow module— More
    17 September 2021
  • Three new ProcessWire Textformatter modules: Find/Replace, Markdown in Markup, and Emoji— More
    3 September 2021
  • This week we have a new master version released after a year in the making. With nearly 40 pull requests, hundreds of new additions and more than 100 issue reports resolved, this new version has a ton of great new stuff— More
    27 August 2021

Latest news

  • ProcessWire Weekly #384
    In the 384th issue of ProcessWire Weekly we'll cover the latest core updates, introduce a new module called Session Allow, and highlight a new site of the week. Read on!
    Weekly.pw / 18 September 2021
  • ProcessWire 3.0.184 new master/main version
    This week we have a new master/main version released after a full year in the making. As you might imagine, this new version has a ton of great new stuff and we’ll try to cover much of it here.
    Blog / 27 August 2021
  • Subscribe to weekly ProcessWire news

“We chose ProcessWire because of its excellent architecture, modular extensibility and the internal API. The CMS offers the necessary flexibility and performance for such a complex website like superbude.de. ProcessWire offers options that are only available for larger systems, such as Drupal, and allows a much slimmer development process.” —xport communication GmbH