
Field validators / validation framework


mindplay.dk

We have an increasing number of Fieldtypes and Inputfields, and I just had a realization: validation is baked into Fieldtypes and Inputfields, wherever it fits.

That seems wrong? Or at best, rather arbitrary - these concerns seem to get baked in wherever they might fit.

So now we end up with things like InputfieldEmail, InputfieldURL and FieldtypeTextUnique, all of which have various types of validations baked into them.

Well, now suppose I want a field that validates both as unique and as an e-mail address? That's two validations. I would need to write another class, and even then I might have to duplicate some code to make that happen. Basically, any arbitrary combination of validations would require a new type.

Have you thought about adding a validation framework of some sort? Allowing for multiple validations on the same Field, configurable error-messages, etc.?

This might reduce the number of Inputfields and Fieldtypes considerably? A lot of fields are just some variety of a text-field, with different validation - it would be nice to be able to apply validations such as uniqueness to either e-mail, URL, integer, Page-reference, etc.

It would also be nice to be able to write site-specific validations and reuse them across multiple fields on some sites.

A configurable general-purpose validator for regular expressions would be another possibility - this could be extremely useful, for example when a field should accept URLs referencing a specific site, like YouTube or Vimeo.
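
As a rough illustration of what such a configurable regular-expression validator could look like, here is a minimal sketch in Python (chosen only to keep it independent of any ProcessWire API). The class name, the pattern and the message are all hypothetical, and the pattern covers only the simplest YouTube and Vimeo URL shapes:

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RegexValidator:
    """Hypothetical configurable validator: a pattern plus an error message."""
    pattern: str
    message: str

    def validate(self, value: str) -> List[str]:
        # fullmatch requires the whole value to match, not just a prefix.
        if re.fullmatch(self.pattern, value):
            return []
        return [self.message]


# Illustrative pattern only; real video URLs come in more shapes than this.
video_url = RegexValidator(
    pattern=r"https?://(www\.)?(youtube\.com/watch\?v=[\w-]+|vimeo\.com/\d+)",
    message="Please enter a YouTube or Vimeo video URL.",
)

print(video_url.validate("https://vimeo.com/12345"))
print(video_url.validate("https://example.com/video"))
```

Because the pattern and the message are plain data, the same validator could in principle be configured per field rather than coded per type.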

Componentized, modular, and configurable validation is something most major CMSs have - is something like this in the stars for ProcessWire?


You don't need another type or class just for different validations of a text field; you can hook into the processInput method of the Inputfield, add your own validation, and add an error if needed. I rarely need this, but it's a great and flexible approach. There's also a regex filter you can add in the field settings for validation, though it may be a little limited.

Maybe some sort of module with common validation methods that can be added or configured, along with multi-language (of course!) error messages, would be a nice addition. This could then be added to a field or through a hook.


I realize you can do your own validations using code and hooks - I'm talking about making those validations visible and configurable from the UI.

But having a real framework for componentized validation and error-reporting wouldn't suck when working with code either - some field-types (such as integer) currently just "sanitize" values, quietly erasing them, rather than producing a user-friendly error-message. And again, you can hard-code all of that per Fieldtype and/or Inputfield, but that just doesn't seem like a very structured approach to me...


I agree with you that it seems like a good idea to incorporate a validation component that can be used flexibly within fieldtypes, inputfields or even on the front-end. On the admin side there could even be some UI for defining validations etc.

Something like http://documentup.com/Respect/Validation/ seems like a nice one, but I don't know if that would work with PW.


Greetings,

Just an idea...

If you are building front-end forms using the ProcessWire API, you could pull in an external library.  I've started using Zebra Form lately, which has a great API for building the forms and doing client- and server-side validations.  I really like it.

Here's a link: http://stefangabos.ro/php-libraries/zebra-form/

Thanks,

Matthew


I'm sure that's great for public-facing pages that don't have anything to do with ProcessWire as such - what I'd like to see though, is a more complete solution to validation in ProcessWire itself, for validating content entered by content managers.


I'm all for more validation options, and generally add them as needs come up. But I want to point out that Fieldtypes define a database schema. What differentiates the need for different fieldtypes is that most represent different storage needs for data. Behind the scenes, they represent a database table. You don't need different fieldtypes purely for validation reasons. Though FieldtypeEmail and FieldtypeURL (and I'm sure others) probably have very similar storage needs... and maybe they could really be the same Fieldtype if we really wanted it that way, but I'd rather have people choosing an "email" or "URL" field rather than a "email/URL" or generic "text" field. This is more consistent with the way HTML5 is going too, as types like "email" and "date" are being used over some validation attribute on text fields. Overall, I think it's easier on the user if they are choosing reasonably well defined types when creating fields, rather than abstracted "I can be anything" types.

As for a "unique" validation, that's different from the others. This would best be handled at the Fieldtype level since it's a type of database index that enforces it. I've been planning to add this to the base Fieldtype class, so that it is available for all Fieldtypes. You may have seen the FieldtypeTextUnique, which was more about proof of concept and solving a need quickly, rather than a suggestion of what path should be taken overall. But we'll definitely be adding support for unique indexes in a manner that spans all Fieldtypes rather than just individually. 

Fieldtypes are involved in the getting and setting of page variables at the API level, making sure that the format is correct for the storage it represents. This is sanitization, just making sure that what gets set to it is valid in general for the type and suitable for storage. But specific interactive validations are the job of Inputfield modules. If there are any major validations we're missing on any of them, let me know and we can get it on the to-do list. I have been working to expand the validation options behind the InputfieldText module, as it now supports regular expressions, which will let you be as specific as you want with validation. New validations have also been added to several other Inputfields over the last year. 

Configurable error messages at the field-by-field level are something we don't have, but I would also like to see them. This is something I think we can add in the near future. They are currently configurable at the fieldtype-by-fieldtype level, but only if you have the multi-language support modules installed. 


That all sounds great, Ryan :)

The only thing I have to disagree with is the idea that fields, which are concerned with storage, should be concerned about the type of data contained in a string. A string is a string - as I think you will realize when you start thinking/designing for validations. I believe you will find it's actually easier to deal with strings when they are consistently represented, managed and stored by the same type.

Your HTML5 example is interesting - yes, URL and e-mail inputs are represented by distinct input-types. That is because HTML forms deal with validation, which is an input concern, not a storage concern - the fact that the value-attribute is always a string demonstrates that fact, because the value-attribute deals with storage. You also see evidence of this when you post the form - a text-input, URL, e-mail or textarea all get posted as the same data-type, a string; nothing indicates what the data-type on the form was, because that's an input concern, not a storage concern.

You can see this also in .NET, where data-types like e-mail and URL are defined by annotating a field with its data-type, rather than by extending the String class into Email and URL classes - it was designed that way for the same reason: because you need the String type to express a storage concern, while the data-type is used for input-concerns such as validation.

Between storage and input concerns, some concerns may often seem to overlap - but it's important to distinguish "hard" storage concerns from "soft" input concerns. To use the simplest possible practical example, consider the "required" validation, an input concern, also known as "nullable", a storage concern - an Inputfield, dealing with input, might produce a nice human-readable error-message, while the Fieldtype, dealing with storage, really ought to throw an exception if the "nullable" constraint fails, since this has nothing to do with input. When dealing with storage, the consumer is the software - you should assume there is no user, because in some cases there really won't be.
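
The "required" versus "nullable" distinction can be sketched as follows. This is a minimal Python illustration, not ProcessWire code; the function names, the exception class and the messages are all hypothetical:

```python
from typing import List, Optional


class StorageConstraintError(Exception):
    """Raised when a value violates a "hard" storage constraint."""


def store_title(value: Optional[str]) -> str:
    # Storage level: the consumer is code, so a violated constraint is an exception.
    if value is None or value == "":
        raise StorageConstraintError("title is not nullable")
    return value


def check_title_input(raw: str) -> List[str]:
    # Input level: the consumer is a person, so report a readable message instead.
    if raw.strip() == "":
        return ["Please enter a title."]
    return []
```

The two functions express the same rule, but each reports a violation in the way that suits its consumer.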

Having to sanitize away invalid values in Fieldtypes is a symptom of this - if you handle input concerns at the storage level, you don't have much of a choice, short of throwing an exception, which isn't very user-friendly.

Another interesting overlapping concern is "uniqueness" or "distinct values" - this almost has to be implemented at the storage-level, and if I were to programmatically violate that constraint, I would expect an exception to be thrown. Rather than duplicating the uniqueness property at the input-level, an input-type should be able to go back to its underlying storage-type via a standard interface and obtain a list of validations that are required to satisfy the "hard" constraints imposed by the storage-type, in addition to the "soft" constraints imposed by the input-type.
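
A minimal sketch of that "standard interface" idea, in Python and with entirely hypothetical names. Since a uniqueness check needs a database, a maximum length stands in for the "hard" storage constraint here:

```python
from typing import Callable, List, Optional

# A validator returns an error message, or None when the value is acceptable.
Validator = Callable[[str], Optional[str]]


def max_length(limit: int) -> Validator:
    def check(value: str) -> Optional[str]:
        return None if len(value) <= limit else f"Must be at most {limit} characters."
    return check


def looks_like_email(value: str) -> Optional[str]:
    # Deliberately crude check, for illustration only.
    return None if "@" in value else "Must be an e-mail address."


class StorageType:
    """Hypothetical storage type that exposes its "hard" constraints."""

    def hard_constraints(self) -> List[Validator]:
        return [max_length(20)]


class InputType:
    """Hypothetical input type that adds "soft" constraints on top."""

    def __init__(self, storage: StorageType) -> None:
        self.storage = storage

    def validators(self) -> List[Validator]:
        # Hard constraints come from storage; they are not duplicated here.
        return self.storage.hard_constraints() + [looks_like_email]

    def validate(self, value: str) -> List[str]:
        results = (v(value) for v in self.validators())
        return [msg for msg in results if msg is not None]
```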

I realize changing any of this would break backwards compatibility, since you already have numerous storage-types that deal with input, so maybe you should consider pushing off the validation architecture to a major release, and consider cleaning up this portion of the API?


Take some of that with a grain of salt, please - I just jotted the whole thing down on a diagram, and some of what I said doesn't quite make sense.

Mainly, I think I had to put it on a diagram to realize that URL, e-mail, color, and other string-based types can be correctly modeled as storage-types, if you wish - validating correctly formed URLs or e-mail addresses at the storage-level (Fieldtype) for example, is perfectly sound.

I will try to finish this diagram to better illustrate what I'm trying to say.


A string or something like FieldtypeText probably isn't the best example to consider when talking about overall Fieldtype design. The Fieldtype architecture is built around being able to support complex data types that might translate to a DB schema with numerous fields within the table. FieldtypeText is the absolute simplest case scenario for a Fieldtype.
 
With the HTML5 example, I'm only talking about definition. I want people to be able to define their fields in a manner that bears some consistency with how they might define their fields in HTML5. Having an 'anything goes' text field that we later check boxes for what we want just doesn't fit the way I'd like users to define fields. I think that's fine for specific details like "max length" or the like, but not for something defining a type like Email or URL. Such an approach would mean we'd likely need to have yet another level of plugins below type, for type validation. This method would be supported by the current architecture if someone wanted to take the approach (it's not far off from how Textformatter plugins work), but I think it would ultimately be a less desirable approach for the core fields.
 
"Having to sanitize away invalid values in Fieldtypes is a symptom of this - if you handle input concerns at the storage level, you don't have much of a choice, short of throwing an exception, which isn't very user-friendly."

It's important to consider that ProcessWire has two different input levels that have distinct needs: interactive and API. Inputfields aren't involved with API-level usage of ProcessWire, only interactive usage.  API-level usage of ProcessWire is another type of input being provided to the system, but one that is certainly different than interactive. This level is more about protecting the type and preventing corruption. As a result, Fieldtypes typically sanitize and Inputfields validate. But if your Fieldtype benefits from some kind of API-level validation, you always have that option. The architecture of Fieldtypes and Inputfields has a whole lot more to do with being scalable to complex data types and needs, and flexible enough to handle diverse situations. For instance, if you wanted to have a FieldtypeAnyText with separate validators, it would be easy to implement under the current architecture.
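
The split between "Fieldtypes sanitize" and "Inputfields validate" might be pictured like this for an integer value. It is a minimal Python sketch with hypothetical function names, not the actual behavior of any ProcessWire class:

```python
from typing import List, Optional, Tuple


def sanitize_int(raw: str) -> Optional[int]:
    # API level: coerce to something safe to store; bad input silently becomes None.
    try:
        return int(raw.strip())
    except ValueError:
        return None


def validate_int(raw: str) -> Tuple[Optional[int], List[str]]:
    # Interactive level: same coercion, but tell the user what went wrong.
    value = sanitize_int(raw)
    if value is None:
        return None, [f"'{raw}' is not a whole number."]
    return value, []
```

The interactive function reuses the API-level coercion and only adds the error reporting, so the two levels cannot drift apart.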

"Another interesting overlapping concern is "uniqueness" or "distinct values" - this almost has to be implemented at the storage-level, and if I were to programmatically violate that constraint, I would expect an exception to be thrown."
 
Luckily this is exactly what a MySQL unique index does. It would have to be a Fieldtype-level setting rather than an Inputfield-level setting, since the Fieldtype controls the schema. 

Okay so we're not just talking about "public-facing pages" but if there were to be a fancy validation tool I suggest that it be capable of helping with that side of things too. Something to assemble JS validation scripts and corresponding server side validation, in a neat programmatic way.

Doing the easy validations client side is great. You still have to sanitize and check when the data comes in but doing it client side is so quick. I've used PEAR QuickForm (kind of a beast in some ways). That has a generic JS validation library (load it once) and then it generates a small script specific to your form. Of course you have to tell it what rules need to be applied to what. Sometimes it's just as easy to do it all by hand but it can be pretty neat for big forms. Maybe do the low hanging fruit (regex tests mostly) and use callbacks or something for anything else.


I'm not opposed to writing code. I'm actually not a fan of everything needing a graphical user interface, and validation is a good example of one of those features where you can expand indefinitely and never actually meet everybody's needs. It's also a good example of functionality that tends to be less generic and more application-specific.

How about a simpler API extension, like a type of module (or module-interface) that can validate a Page instance?

The UI would consist only of a list of checkboxes (one for each validator module) that you can select when editing a Template - no graphical configuration for validators, just a simple on/off switch that indicates whether it applies.

The interface would be something along the lines of validate(Page $page, ValidationErrorList $errors) and possibly a boolean argument indicating whether the Page was edited via the admin UI or via a custom (public facing) form.

This much simpler approach would enable you to build out not only individual Field validations, but also co-dependent validations, for example: option A is valid only when checkbox B is checked, etc...
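
To make the proposal concrete, here is a minimal sketch in Python rather than PHP. The validate(page, errors) shape follows the signature suggested above; everything else (the dict standing in for a Page, the class names, the field names) is hypothetical:

```python
from typing import Dict, Iterable, List, Protocol

Page = Dict[str, object]  # stand-in for a real Page object


class ValidationErrorList:
    def __init__(self) -> None:
        self.errors: List[str] = []

    def add(self, field: str, message: str) -> None:
        self.errors.append(f"{field}: {message}")


class PageValidator(Protocol):
    def validate(self, page: Page, errors: ValidationErrorList) -> None: ...


class OptionARequiresCheckboxB:
    """Co-dependent rule: option A is valid only when checkbox B is checked."""

    def validate(self, page: Page, errors: ValidationErrorList) -> None:
        if page.get("option_a") and not page.get("checkbox_b"):
            errors.add("option_a", "Option A is only valid when checkbox B is checked.")


def validate_page(page: Page, validators: Iterable[PageValidator]) -> ValidationErrorList:
    # Run every validator that is switched on for the template.
    errors = ValidationErrorList()
    for validator in validators:
        validator.validate(page, errors)
    return errors
```

Each validator sees the whole page, which is what makes rules spanning several fields possible.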

Thoughts?


I like what you are saying about the dependent validations, and something that validates a whole page sounds interesting. Though since fieldtypes and inputfields are of types that may be completely unknown to the page (since they are plugins), and independent from one another, the core has to keep the validation connected with the fields. But as an external validation option I think it sounds interesting. 


  • 4 weeks later...

Just to throw a little stone in the pond: I know it is possible to define Fieldtypes and Inputfields and with that create our own hooks and validations. Would it be reasonable to consider simplifying this approach? Allowing a core Fieldtype/Inputfield to be augmented by something like a specialized module (or even code added through a field) at the Field definition level?


"Allowing a core Fieldtype/Inputfield to be augmented by something like a specialized module (or even code added through a field) at the Field definition level?"

Technically everything can already be hooked into, return values modified, or methods replaced, the most likely candidates being Inputfield::render() and Inputfield::processInput(). But admittedly I can't think of an occasion where I'd need to do that. Fieldtypes are defining data that goes into the database and needs a schema, while Inputfields are handling that input at the UI level. An "anything goes" type Inputfield where you override the render() and/or processInput() with hooks would likely only be useful for situations where you want to substitute the way the input is performed at the UI level... but that's the purpose of Inputfields in the first place, so the question would be: why not just make an Inputfield? Perhaps there are some cases, like if you wanted to replace all <select> fields with a jQuery driven Chosen field or something. As for simplifying the approach, it's actually really very simple and a lot more so than what I think most people realize. But when people are making their own Fieldtype/Inputfields it's usually because they are trying to meet some data storage or input need that is not fundamentally simple in any system. In my opinion it's simpler and more reliable to handle such a need using the pattern made for it (i.e. custom Fieldtypes and/or Inputfields), rather than to selectively override things. 


"An "anything goes" type Inputfield where you override the render() and/or processInput() with hooks would likely only be useful for situations where you want to substitute the way the input is performed at the UI level... but that's the purpose of Inputfields in the first place, so the question would be: why not just make an Inputfield?"

The reason most UI/form/validation frameworks allow for some kind of composition, is that inheritance is hierarchical, and validation concerns aren't always hierarchical.

You can anticipate basic validations such as "required" and build that into the base-class of all input-types, but you can't anticipate every possible cross-cutting validation concern.

To give a concrete example, class A might validate that the input looks like an e-mail address. Class B might extend A and also validate that the e-mail address is at a particular domain. Class C might extend A and validate that the e-mail address does not contain the word "noreply".

Now if class D needs to validate all of the above, suddenly you have an interesting problem - you can't extend both B and C, so now you have to refactor and most likely use static methods or code duplication to solve this issue.

Plus you end up with crazy class names like InputfieldEmail, InputfieldEmailAtDomain, InputfieldEmailAtDomainNotNoReply, etc.

For validation concerns, composition just seems like a more natural choice than inheritance, and it is the general approach taken to validation by most frameworks.
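
The A/B/C/D example above could be expressed with composition roughly like this. It is a minimal Python sketch; the function names, the crude e-mail pattern and the example domain are all hypothetical:

```python
import re
from typing import Callable, List, Optional

# A validator returns an error message, or None when the value is acceptable.
Validator = Callable[[str], Optional[str]]


def is_email(value: str) -> Optional[str]:
    # Crude shape check, for illustration only.
    ok = re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", value)
    return None if ok else "Not a valid e-mail address."


def at_domain(domain: str) -> Validator:
    def check(value: str) -> Optional[str]:
        return None if value.lower().endswith("@" + domain) else f"Address must be at {domain}."
    return check


def not_noreply(value: str) -> Optional[str]:
    return "noreply addresses are not allowed." if "noreply" in value.lower() else None


def all_of(*validators: Validator) -> Callable[[str], List[str]]:
    # Compose any number of validators; collect every message that is not None.
    def run(value: str) -> List[str]:
        return [msg for msg in (v(value) for v in validators) if msg is not None]
    return run


# "Class D": all three rules at once, with no new subclass required.
validate_d = all_of(is_email, at_domain("example.com"), not_noreply)
```

Any other combination is just a different argument list to all_of, so no class names like InputfieldEmailAtDomainNotNoReply are needed.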


  • 5 years later...

Wouldn't it be a good step in the right direction to validate the HTML5 patterns via PHP too? They are valid regular expressions, and it should not be too hard to process them via preg_match(). Selecting sanitization and validation modules the way textformatters are selected would be really cool too, and it wouldn't be too hard to put together a collection of both. You could even use textformatters for that (at least for sanitization). This would also be helpful for other fields, like certain currencies and date formats, where you could correct some badly formatted input.

Building tons of fields doesn't seem very comfortable to me, though it is an option. Just using the HTML5 patterns would relieve us of a lot of work, and even if it's not a perfect solution it would help a lot.

Having an easy and quick way to do some simple custom validations/sanitizations in text fields is one of the last few things I really miss in PW.

 


Addition: I found that the Pattern field already states that it validates server-side too.

I created a field PLZ (a German postal/ZIP code) and added the pattern [0-9]+

I put it into an existing page/template and tried to enter letters in the backend; the HTML5 validation reacted as it should.

I use Firefox, so I deactivated validation using this plugin: https://addons.mozilla.org/en-US/firefox/addon/defyformvalidation/

Now I save the page and get a PLZ of 5666a, for example.

I already tested this quite a while ago, and that was the reason why I wrote the above comment. Sorry for not reading the field description closely; now I know that it should validate, but it still does not.
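
For reference, this is roughly what a server-side check of that pattern would need to do. It is a minimal Python sketch with a hypothetical function name, and it assumes the pattern uses syntax that Python's re module shares with HTML5 (which uses JavaScript regular expressions). The key point is that the HTML5 pattern attribute has to match the entire value, so the server-side check must be anchored the same way:

```python
import re


def matches_html5_pattern(pattern: str, value: str) -> bool:
    # The HTML5 pattern attribute must match the whole value, so use fullmatch
    # rather than an unanchored search.
    return re.fullmatch(pattern, value) is not None


print(matches_html5_pattern("[0-9]+", "56665"))
print(matches_html5_pattern("[0-9]+", "5666a"))
```

An unanchored search for [0-9]+ would find the digits in 5666a and accept it. Whether that is what happens inside ProcessWire in this case is only a guess.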

Another thing is the error message: the browser simply displays the default message, like "Please match the requested format", which is very user-unfriendly. According to https://stackoverflow.com/questions/10361460/how-can-i-change-or-remove-html5-form-validation-default-error-messages
it should be possible to implement custom messages in HTML5. By the way, that would be great in server-side validation too.

I mainly use PW for building small DB applications, so user-friendliness in the backend is very important to me. If you use it mainly for websites it's not that important, but I would still find it comfortable to have an error message available from my field settings when building front-end templates. It would be even better if that message were multi-language.

Another issue with backend forms and HTML5 validation is that the error messages only appear one after another. The user corrects one field, presses send and gets the next error presented; if you have 10 errors, that takes a while...

The problem is described very well in this article, which also offers a solution:
https://www.tjvantoll.com/2012/08/05/html5-form-validation-showing-all-error-messages/

@admins
If you think this post belongs in a new thread, feel free to move it; I am uncertain.

 

 
