Jump to content

JsonNativeField: key-value table + native JSON = fully searchable


BitPoet

Recommended Posts

Another proof-of-concept module:

JsonNativeField

Leverages MySQL >= 5.7.8's native JSON column type for key-value storage.

This gives us the opportunity to store arbitrary (textual) fields in the database and allows us to search for them with plain subfield selector syntax, including wildcard operators. Possible applications are storing submitted form data or adding user-defined properties to their profile.

Currently, the module is still really, really alpha, but I wanted to get input early on.  Let me know what you think.

Download from GitHub.

Here are a few screenshots for anybody interested but unable to try for themselves:

1. First page with json field and custom subfields:

jsonfield1-page1.png

 

2. Second page with json field and custom subfields:

jsonfield2-page2.png

 

3. Searching through all subfields in the json field:

jsonfield3-search-all-subfields.png

 

4. Searching only in a specific subfield:

jsonfield4-search-specific-subfield.png

 

The interface is really rather crude still. The "delete" buttons only work after saving (have to attach listeners to the newly created entries yet), and I've got to straighten out supported operators for Lister to pick up. I'll see if I find some time tomorrow to work on these issues and brush up the visual side a bit.

Edited by BitPoet
Added screenshots
  • Like 19
Link to comment
Share on other sites

Just a quick status update: I've started implementing different field types (text, integer, float, date, date+time) with native inputfields. Types can be switched back and forth, and I'll make it hookable so custom inputfields can be added as well. Will be a while though, since it looks like I'll be busy with a lot of other stuff until the end of next week. Here's a teaser though...

jsonfield5-types.png

  • Like 8
Link to comment
Share on other sites

Unexpectedly found a half day to spare, so I've pushed 0.0.3 to GitHub.

New features:

  • Field settings for custom date, date+time and text fields can now be made on the inputfield's input tab, e.g. date and time formats, placeholder text or text validatoin patterns (numeric field settings are on the todo list)
  • Names for custom fields are limited to start and end with ASCII letters and only have alphanumeric ASCII characters and hyphens in between
  • Custom field name of "data" is silently stripped to avoid bollixing up all entries
  • The custom fields table in the backend is now sorted by name (new entries simply get appended)

The module is still alpha. Things are getting smoother, but there are still a few items on the todo list. :)

  • Like 10
Link to comment
Share on other sites

@BitPoet

Wow, this is perfect timing, as I am just working on the requirements for a project that requires storing a lot of JSON.

Does this enable storage and retrieval of nested JSON content that is multiple layers deep? E.g. $pages->find('jsonField.subField.field=foo') ?

Obviously this would present some challenges for displaying the data on the back end using your current tabular format. Maybe a nested list (like ProcessWire's page tree) would be a more scalable approach?

Link to comment
Share on other sites

I admit I haven't thought too much about nesting, and the wiring around the field's json data would obviously need to be completely different. InnoDB does support searching in nested structures, but PW already treats subfields on the left side of a selector in its own way and throws away anything starting at the second full stop. A different syntax would be necessary, e.g.

$pages->find('jsonField=/subField/field:foo') 

There might be issues where search expressions need some kind of escaping (e.g. searching for literal text "/subfield"), but this would allow for wildcard queries.

$pages->find('jsonField=/subField/*/name:foo') 

Or even searching arrays (any element or by index):

$pages->find('jsonField=/subField/fields#*/name:foo');

$pages->find('jsonField=/subField/field#12/name:foo');

Here's a tiny adoption of my module to allow this search syntax. There's no back and forth conversion though, just a textarea containing the JSON.

<?php

class FieldtypeJsonDocument extends FieldtypeTextarea implements Module {
	public static function getModuleInfo() {
		return array(
			"title"				=>	"Fieldtype Json Document",
			"summary"			=>	"Fieldtype utilizing native MySQL support for searching JSON documents.",
			"version"			=>	"0.0.3",
		);
	}
	
	public function getMatchQuery($query, $table, $subfield, $operator, $value) {
		$database = $this->wire("database");

		list($path, $value) = explode(':', $value, 2);
		if(empty($value) && !empty($path)) {
			$value = $path;
			$path = "";
		}
		$path = '$' . ((empty($subfield) || $subfield == "data") ? "" : ".$subfield") . (empty($path) ? ".*" : str_replace('/', '.', preg_replace('~/?#(\d+|\*)~', "[$1]", $path)));

		$table = $database->escapeTable($table);
		$value = $database->escapeStr($value);
		if($operator == "=") {
			$query->where("JSON_SEARCH({$table}.data, 'one', '$value', NULL, '$path') IS NOT NULL");			
		} else if($operator == "*=" || $operator == "%=") {
			$query->where("JSON_SEARCH({$table}.data, 'one', '%$value%', NULL, '$path') IS NOT NULL");			
		} else if($operator == "^=") {
			$query->where("JSON_SEARCH({$table}.data, 'one', '$value%', NULL, '$path') IS NOT NULL");			
		} else if($operator == "$=") {
			$query->where("JSON_SEARCH({$table}.data, 'one', '%$value', NULL, '$path') IS NOT NULL");			
		}

		$this->log->message($query->getQuery());

		return $query; 
	}

	
	public function getDatabaseSchema(Field $field) {
		$engine = $this->wire('config')->dbEngine; 
		$charset = $this->wire('config')->dbCharset;
		$schema = array(
			'pages_id' => 'int UNSIGNED NOT NULL', 
			'data' => "JSON", // each Fieldtype should override this in particular
			'keys' => array(
				'primary' => 'PRIMARY KEY (`pages_id`)',
			),
			// additional data 
			'xtra' => array(
				// any optional statements that should follow after the closing paren (i.e. engine, default charset, etc)
				'append' => "ENGINE=$engine DEFAULT CHARSET=$charset", 
				
				// true (default) if this schema provides all storage for this fieldtype.
				// false if other storage is involved with this fieldtype, beyond this schema (like repeaters, PageTable, etc.)
				'all' => true, 
			)
		); 
		return $schema; 
	}

    public function getInputfield(Page $page, Field $field) {
        $inputField = $this->modules->get('InputfieldTextarea');
        return $inputField;
    }
	
	public function install() {
		if($this->config->dbEngine != "InnoDB") {
			throw new WireException($this->_("InnoDB database engine needs to be used for native JSON support"));
		}
		
		$dbver = $this->database->getAttribute(PDO::ATTR_SERVER_VERSION);
		if(version_compare($dbver, '5.7.8', '<')) {
			throw new WireException(sprintf($this->_("MySQL Server version needs to be at least 5.7.8 for fully working JSON support, installed version is %s"), $dbver));
		}
	}
}

There would probably be a bit of recursive shuffling necessary in sleep-/wakeup-/sanitizeValue to build nested structures based on WireData/WireArray so accessing subfields and filtering PageArrays is possible, but that should be doable too. How difficult the backend input + display parts get depends solely on your exact requirements.

  • Like 4
Link to comment
Share on other sites

Status update: JsonNativeField version 0.0.4 now has configuration options for all field types in the field's input settings. Coming up next is some more data validation (i.e. "don't shoot your foot" checking) both in the UI when switching input type and in the module code when saving to and loading from the db. This will be the major step towards moving the module from alpha to beta stage.

  • Like 6
Link to comment
Share on other sites

  • 1 year later...

It's a bit on the back burner right now. I was hoping for some more elaborate possibilities with MySQL 8, but the changes to JSON support there were on the homeopathic end rather than the dynamic typing support I had hoped for. I plan to brush up the UI and client side validation a bit once PW 3.1 gets out, though, and test things out in depth with UIkit admin theme.

  • Like 6
Link to comment
Share on other sites

  • 2 years later...

I've been searching for a viable means to store json without having to use a document db as well. Is this still on the back burner? If not, any suggestions on efficiently searching large datasets?
Thanks!

Link to comment
Share on other sites

Maybe totally OT but I had to store some really large JSON objects and therefore stored them straight in pages. Maybe not ideal but even after 12,000 entries/pages the system responds super fast.

If I remember correctly I used a textarea field and never had any issues.

Not perfect but... well... it works.

  • Like 1
Link to comment
Share on other sites

I'm having to read the whole dataset into an array in order to process entries. Some pages have 5K+ entries, others are more manageable. It's a global Company <--> Contractor relationship.

Link to comment
Share on other sites

  • 1 year later...

Good day, @BitPoet!

I am looking at this field to implement custom properties for pages without a predefined schema, but so they can still be searchable by them. I can't think of a better solution, than this field. It really looks awesome!

I need to fill those properties via API during import. I haven's spotted field API docs neither here nor on github. Did I miss them? If there aren't any, could you please share an example of reading from/writing to this field?

Link to comment
Share on other sites

  • 2 weeks later...
On 8/6/2021 at 3:38 PM, Ivan Gretsky said:

Good day, @BitPoet!

I am looking at this field to implement custom properties for pages without a predefined schema, but so they can still be searchable by them. I can't think of a better solution, than this field. It really looks awesome!

I need to fill those properties via API during import. I haven's spotted field API docs neither here nor on github. Did I miss them? If there aren't any, could you please share an example of reading from/writing to this field?

Hi Ivan,

I never really looked at API usage up to now (and once I did, I noticed that the module really should be namespaced).

Here's a short example for filling values in code:

<?php namespace ProcessWire;

/*
	Fill the JsonNativeField field "nativefield" in Page $p through API
*/

$fdata = new \JsonData();		// This is the class for a JsonNativeFieldValue
$fdata->set('first-row', new \JsonValue('text', 'Text in first row'));
$fdata->set('second-row', new \JsonValue('integer', 123));
$fdata->set('third-row', new \JsonValue('float', 3.1415927));
$fdata->set('fourth-row', new \JsonValue('date', time());
$fdata->set('fifth-row', new \JsonValue('datetime', time() + 3600));

$p->of(false);
$p->nativefield = $fdata;
$p->save();
            
/* Change value for "fifth-row" in our field */
$p>-nativeField->set('fifth-row', new \JsonValue('datetime', time() - 3600));
$p->save();

 

 

  • Like 3
Link to comment
Share on other sites

Posted (edited)
19 hours ago, Ivan Gretsky said:

While you're at it, maybe you'll have time to look at this issue? Solving it could make using this field possible for existing installations and therefore more common. I think this module deserves more attention and usage.

I've changed that so it only checks for minimum MySQL Version. Still, the module is very much alpha code and work-in-progress. I haven't gotten around to apply and test the field type configuration options, and there are quite a few rough spots in the existing code.

FYI, I decided to rename my helper classes as I realized that JsonData and JsonValue may make nice names for core classes. To avoid clashing and hogging names I've prepended "JN", so it's JNJsonData and JNJsonValue now.

The new release is still in testing stage and can be found here.

Edited by BitPoet
Added link to dev tree in github repo
  • Like 3
Link to comment
Share on other sites

1. Thanks, @BitPoet! I did install the dev version in a MyIsam db. It works!

2. This doesn't seem to work. I get an error Call to a member function set() on string. Am I doing something wrong?

/* Change value for "fifth-row" in our field */
$p>-nativeField->set('fifth-row', new \JsonValue('datetime', time() - 3600));
$p->save();

3. You gave example of how to set new values. Can you help with getting the value of a specified key and outputting it.

4. ...and adding new key/value fields to an existing set.

5. What are field type configuration options? Are you talking about setting a new custom type (for example, not just text, but email / phone) with its own customizable pattern checks and so on? That was exactly what I was about to propose. The other cool thing would be to use custom inputfields for types. Like Page Reference to get a comma separated pages list and store it as a string.

6. Another idea is the ability to limit the names and/or name/type combinations to a predefined set, so the user can only choose from the list, but not create new key/value pairs.

I do understand, that this module is probably not your top priority, as it was here unchanged for a while. But just can't resist writing here all my wild wishes)

Link to comment
Share on other sites

4 hours ago, Ivan Gretsky said:
$p>-nativeField->set('fifth-row', new \JsonValue('datetime', time() - 3600));

 

7 hours ago, BitPoet said:

FYI, I decided to rename my helper classes

 

7 hours ago, BitPoet said:

I've prepended "JN", so it's JNJsonData and JNJsonValue now.

 

Try:

$p>-nativeField->set('fifth-row', new \JNJsonValue('datetime', ti


 

Link to comment
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
  • Recently Browsing   0 members

    • No registered users viewing this page.
  • Similar Content

    • By MarkE
      This fieldtype and inputfield bundle was built for storing measurement values within a field, rendering them in a variety of formats and converting them to other units or otherwise modifying them via the API.
      The API consists of a number of predefined functions, some of which include...
      render() for rendering the measurement object, valueAs() for converting the value to another unit value, convertTo() for converting the whole measurement object to different units, and add() and subtract() for for modifying the stored value by the value (converted as required) in another measurement. In the admin the inputfield includes a checkbox (which can be optionally disabled) for converting values on page save. For an example if a value was typed in as centimeters, the unit was changed to metres, and the page saved with this checkbox selected, said value would be automatically converted so that e.g. 170 cm becomes 1.7 m.

      A simple length field using Fieldtype Measurement and Inputfield Measurement.
      Combination units (e.g. feet and inches) are also supported.
      Please note that this module is 'proof of concept' at the moment - there are limited units available and quite a lot of code tidying to do. More units will be added shortly.
      See the GitHub at https://github.com/MetaTunes/FieldtypeMeasurement for full details and updates.
    • By tcnet
      File Manager for ProcessWire is a module to manager files and folders from the CMS backend. It supports creating, deleting, renaming, packing, unpacking, uploading, downloading and editing of files and folders. The integrated code editor ACE supports highlighting of all common programming languages.
      https://github.com/techcnet/ProcessFileManager

      Warning
      This module is probably the most powerful module. You might destroy your processwire installation if you don't exactly know what you doing. Be careful and use it at your own risk!
      ACE code editor
      This module uses ACE code editor available from: https://github.com/ajaxorg/ace

      Dragscroll
      This module uses the JavaScript dragscroll available from: http://github.com/asvd/dragscroll. Dragscroll adds the ability to drag the table horizontally with the mouse pointer.
      PHP File Manager
      This module uses a modified version of PHP File Manager available from: https://github.com/alexantr/filemanager
       
    • By tcnet
      This module implements the website live chat service from tawk.to. Actually the module doesn't have to do much. It just need to inserted a few lines of JavaScript just before the closing body tag </body> on each side. However, the module offers additional options to display the widget only on certain pages.
      Create an account
      Visit https://www.tawk.to and create an account. It's free! At some point you will reach a page where you can copy the required JavaScript-code.

      Open the module settings and paste the JavaScript-code into the field as shown below. Click "Submit" and that's all.

      Open the module settings
      The settings for this module are located int the menu Modules=>Configure=>LiveChatTawkTo.

       
    • By tcnet
      Session Viewer is a module for ProcessWire to list session files and display session data. This module is helpful to display the session data of a specific session or to kick out a logged in user by simply delete his session file. After installation the module is available in the Setup menu.

      The following conditions must be met for the module to work properly:
      Session files
      Session data must be stored in session files, which is the default way in ProcessWire. Sessions stored in the database are not supported by this module. The path to the directory where the session files are stored must be declared in the ProcessWire configuration which is by default: site/assets/sessions.
      Serialize handler
      In order to transform session data easier back to a PHP array, the session data is stored serialized. PHP offers a way to declare a custom serialize handler. This module supports only the default serialize handlers: php, php_binary and php_serialize. WDDX was dropped in PHP 7.4.0 and is therefore not supported by this module as well as any other custom serialize handler. Which serialize handler is actually used you can find out in the module configuration which is available under Modules=>Configure=>SessionViewer.

      Session data
      The session data can be displayed in two different ways. PHP's default output for arrays print_r() or by default for this module nice_r() offered on github: https://github.com/uuf6429/nice_r. There is a setting in the module configuration if someone prefers print_r(). Apart from the better handling and overview of the folded session data the output of nice_r() looks indeed nicer.

      Links
      ProcessWire module directory
      github.com
    • By Robin S
      Repeater Easy Sort
      Adds a compact "easy-sort" mode to Repeater and Repeater Matrix, making those fields easier to sort when there are a large number of items.
      The module also enhances Repeater Matrix by allowing a colour to be set for each matrix type. This colour is used in the item headers and in the "add new" links, to help visually distinguish different matrix types in the inputfield.
      Screencasts
      A Repeater field

      A Repeater Matrix field with custom header colours

      Easy-sort mode
      Each Repeater/Matrix item gets an double-arrow icon in the item header. Click this icon to enter easy-sort mode.
      While in easy-sort mode:
      The items will reduce in width so that more items can be shown on the screen at once. The minimum width is configurable in the field settings. Any items that were in an open state are collapsed, but when you exit easy-sort mode the previously open items will be reopened. You can drag an item left/right/up/down to sort it within the items. The item that you clicked the icon for is shown with a black background. This makes it easier to find the item you want to move in easy-sort mode. You can click an item header to open the item. An "Exit easy-sort mode" button appears at the bottom of the inputfield. Configuration
      In the field settings for Repeater and Repeater Matrix fields you can define a minimum width in pixels for items in easy-sort mode. While in easy-sort mode the items will be sized to neatly fill the available width on any screen size but will never be narrower than the width you set here.
      In the field settings for Repeater Matrix you can define a custom header colour for each matrix type using an HTML "color" type input. The default colour for this type of input is black, so when black is selected in the input it means that no custom colour will be applied to the header.
      Exclusions
      The easy-sort mode is only possible on Repeater/Matrix fields that do not use the "item depth" option.
       
      https://github.com/Toutouwai/RepeaterEasySort
      https://processwire.com/modules/repeater-easy-sort/
×
×
  • Create New...