patrick Posted October 8, 2022
How can I prevent HTML Purifier from converting <br> to <br />? Which hook and which HTML Purifier config setting are needed?
BrendonKoz Posted October 10, 2022
Using the HTMLPurifier demo page, if you change the defined Doctype to either of the standard HTML Doctypes, the BR tag will not use the XHTML-formatted version. That said, there may be other unforeseen consequences in making this change. If this is simply an aesthetic preference, I would personally recommend leaving it alone. If there's another reason for it, I'm thinking that a call to str_replace on your data would be far simpler.
If you do decide to customize HTMLPurifier, a good place to start for examples would be here and here. Here's an example set call that would change the Doctype to HTML Transitional, adjusted from the PW API Documentation example:
    $wireData = $markupHTMLPurifier->set('HTML.Doctype', 'HTML 4.01 Transitional');
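For reference, the str_replace route mentioned above might look roughly like the sketch below. The function name normalizeBrTags is hypothetical (it is not part of ProcessWire or HTML Purifier), and where you would call it (before saving or when rendering the field) is left open. It only covers bare <br /> and <br/> tags, not <br> tags that carry attributes.

```php
<?php
/**
 * Rewrite XHTML-style self-closing line breaks (<br />, <br/>) as <br>.
 * Hypothetical helper for illustration; not a ProcessWire or HTML Purifier API.
 */
function normalizeBrTags(string $html): string
{
    // str_ireplace matches case-insensitively, so <BR /> is rewritten too.
    // Tags with attributes (e.g. <br class="x" />) are left untouched.
    return str_ireplace(['<br />', '<br/>'], '<br>', $html);
}

// Example call on a small fragment of field markup.
echo normalizeBrTags('<p>Line one<br />Line two<br/>Line three</p>');
```

Plain string replacement keeps this independent of the HTML Purifier configuration, so HTML5 elements in the markup are not affected by it.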
Robin S Posted October 10, 2022
In the context of CKEditor the <br /> tag is probably caused by the CKEditor settings rather than HTML Purifier. Using this answer as a reference, you could put the following in /site/modules/InputfieldCKEditor/config.js:
    // When CKEditor instance ready
    CKEDITOR.on('instanceReady', function(event) {
        // Output self-closing tags the HTML5 way, like <br>
        event.editor.dataProcessor.writer.selfClosingEnd = '>';
    });
patrick Posted October 10, 2022 Author
Hi Robin
Thanks for your answer. I tried it. With the above solution in the CKEditor config, br tags are shown as <br> in the CKEditor source view, but unfortunately they are still saved in the database as <br />.
patrick Posted October 10, 2022 Author
Hi Brendon
Thanks for your answer. The reason is to make the source validate as HTML5 at the W3C validator. I was thinking about str_replace too, but was wondering if there is a nicer way to have it saved in the database as HTML5 already. I tried the following in admin.php:
    $wire->addHookAfter('MarkupHTMLPurifier::initConfig', function(HookEvent $event) {
        $def = $event->arguments(1);
        $this->settings->set('HTML.Doctype', 'HTML 4.01 Transitional');
    });
but that doesn't seem to do the trick.
Robin S Posted October 11, 2022
@patrick, try this:
    $wire->addHookAfter('MarkupHTMLPurifier::initConfig', function(HookEvent $event) {
        $settings = $event->arguments(0);
        $settings->set('HTML.Doctype', 'HTML 4.01 Transitional');
    });
For this to take effect you'll also need to clear the HTML Purifier cache, which you can do by executing the following once (the Tracy Debugger console is useful for this sort of thing):
    $purifier = new MarkupHTMLPurifier();
    $purifier->clearCache();
patrick Posted October 11, 2022 Author
Hi Robin
Cool, this is working. Many thanks for taking the time!
The only problem now: because of the HTML 4 doctype, other elements are getting converted. For example, <figure> gets deleted (which shouldn't happen). Hopefully there will be an HTML5 value for HTML.Doctype in the future.
Have a nice day, and greetings from Switzerland to New Zealand
Patrick
BrendonKoz Posted October 12, 2022
Unfortunately, from discussions I've seen, either HTMLPurifier is unlikely to implement a supported HTML5 Doctype, or it will be quite a while before one arrives. They're taking contributions, but the amount of work (and surrounding understanding) seems daunting. The removal of unsupported elements would be part of the unforeseen consequences I mentioned. I'd still recommend using a call to str_replace instead if you do decide to stick with this.
That said, using <br/> or <br /> is not invalid, and is allowed. You're seeing info messages in the W3C validator service, not notices or warning messages. It's because HTML5 doesn't require (like XHTML did) that element attribute values are contained within quoted strings, and if an element using unquoted values ended with a trailing slash and no word boundary, it could cause confusion in the browser. See their example for more info: https://github.com/validator/validator/wiki/Markup-»-Void-elements#trailing-slashes-directly-preceded-by-unquoted-attribute-values
This was a great exercise in how to customize the HTMLPurifier and CKEditor components, but I'd personally recommend, at least in this instance, not making that change. That said, it's completely up to you, and you have a working solution for your target goal now!
Robin S Posted October 13, 2022
On 10/11/2022 at 11:23 PM, patrick said: The only problem now: due to the html4 value other elements are getting converted, for example <figure> gets deleted (what shouldn't happen).
One more option... You could copy MarkupHTMLPurifier from /wire/modules/Markup/ to /site/modules/ and then select it as the copy you want to use. Then edit HTMLPurifier.standalone.php to replace this code with:
    return '<' . $token->name . ($attr ? ' ' : '') . $attr . '>';
This seems to solve the slash issue without affecting HTML5 elements like <figure>.
patrick Posted October 13, 2022 Author
Hi Robin
Thanks a lot for your answer. I copied MarkupHTMLPurifier to /site/modules/ and made the changes. I like this approach and I will stick with your solution!
Thanks again and have a nice day!