Hello all,
I was running into an issue where a particular repeater matrix structure failed to index properly.
In a structure like this:
Template
  Repeater Matrix Field
    Matrix Type
      Page Field
        Page
          Id
          Name
          Title
          Headline
          Body
          Summary
In other words, the page template has 'zones' where you can place a matrix type that acts as a block, and that block contains a page selection field used to insert a piece of content. The content of that referenced page was not getting indexed properly, no matter which indexed fields were configured.
In SearchEngine/lib/Index.php, the function ___getFieldIndex has a branch for page reference fields that ignores the $indexed_fields array and passes a hard-coded array (id, name, title) instead. In the version below I have commented that array out and passed $indexed_fields through, and the referenced page content for Repeater Matrix types now indexes properly:
/**
 * Get index for a single field
 *
 * @param \ProcessWire\Field $field
 * @param \ProcessWire\WireData $object
 * @param array $indexed_fields
 * @param string $prefix
 * @param array $args
 * @return array
 */
protected function ___getFieldIndex(\ProcessWire\Field $field, \ProcessWire\WireData $object, array $indexed_fields = [], string $prefix = '', array $args = []): array {
    $index = [];
    if ($this->isRepeatableField($field)) {
        $index = $this->getRepeatableIndexValue($object, $field, $indexed_fields, $prefix);
    } else if ($field->type->className() == 'FieldtypeFieldsetPage') {
        $index = $this->getPageIndex(
            $this->getUnformattedFieldValue($object, $field->name),
            $indexed_fields,
            $prefix . $field->name . '.',
            $args
        );
    } else if ($field->type instanceof \ProcessWire\FieldtypePage) {
        // Note: unlike with FieldtypeFieldsetPage above, here we want to check for both FieldtypePage
        // AND any class that might potentially extend it, which is why we're using instanceof.
        /**
        $index = $this->getPageReferenceIndexValue($object, $field, [
            'id',
            'name',
            'title',
        ], $prefix);
        */
        $index = $this->getPageReferenceIndexValue($object, $field, $indexed_fields, $prefix);
    } else {
        $index_value = $this->getIndexValue($object, $field, $indexed_fields);
        $index[$prefix . $field->name] = $index_value->getValue();
        foreach ($index_value->getMeta(true) as $meta_key => $meta_value) {
            $meta_value = explode(':', $meta_value);
            $index[self::META_PREFIX . $meta_key . '.' . $field->name . '.' . array_shift($meta_value) . ':'] = implode(':', $meta_value);
        }
    }
    return array_filter($index);
}
I don't know if the hard-coded array was put in place for performance reasons, but if you are using a page template structure with page fields you may have run into this problem, where in some cases only the content titles are included in the search_index output.
This change does increase the indexing time, but the referenced pages are indexed fully.
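
For anyone trying to picture the difference, here is a minimal standalone sketch of the effect. It is not the module's code: the helper sketchPageReferenceIndex, the sample page array and the 'page_field.' prefix are all made up for illustration, and the referenced page is reduced to a plain array. It only shows how a fixed field list drops headline, body and summary, while passing the configured list through keeps them.

```php
<?php
// Hypothetical stand-in for a referenced page: a plain array of field values.
// (Illustration only; the real module works with ProcessWire Page objects.)
$referencedPage = [
    'id' => 1234,
    'name' => 'about-us',
    'title' => 'About us',
    'headline' => 'Who we are',
    'body' => 'Long body text',
    'summary' => 'Short summary',
];

/**
 * Collect index entries for one referenced page, limited to the given fields.
 * Hypothetical helper, not part of SearchEngine.
 */
function sketchPageReferenceIndex(array $page, array $indexedFields, string $prefix = ''): array {
    $index = [];
    foreach ($indexedFields as $fieldName) {
        // Only fields named in the list ever reach the index.
        if (isset($page[$fieldName]) && $page[$fieldName] !== '') {
            $index[$prefix . $fieldName] = (string) $page[$fieldName];
        }
    }
    return $index;
}

// Assumed site configuration: the fields the site owner wants indexed.
$configuredFields = ['title', 'headline', 'body', 'summary'];

// Fixed list, as in the commented-out code: deeper content is dropped.
$hardCoded = sketchPageReferenceIndex($referencedPage, ['id', 'name', 'title'], 'page_field.');

// Configured list passed through, as in the modified version.
$passedThrough = sketchPageReferenceIndex($referencedPage, $configuredFields, 'page_field.');

echo implode(', ', array_keys($hardCoded)), "\n";
echo implode(', ', array_keys($passedThrough)), "\n";
```

The trade-off is the same one mentioned above: the second call has more text to collect per referenced page, so it costs more time, but nothing the site owner asked for is silently left out.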