Jump to content
iipa

Sanitize hidden characters from CKEditor

Recommended Posts

I have a CKEditor Textarea in a page template. Some users like to add text in them by pasting from Word document. This leads to internal server error when saving page. When using paste without formatting (cmd + shift + V), page is saved normally, so I assume error has something to do with Word's hidden characters that cause issues in many other programs as well. (I don't have Word myself, so I debugged this with video chat with user. I forgot to ask to check code view, so I'm not sure if they are visible there.)

Is there a way in ProcessWire/PHP to sanitize Textarea input from these hidden characters, or can I prevent this by changing editor settings (listed below, if it helps)? I don't like leaving error handling rely to user action - somebody always forgets to do things specific way and it weakens user experience.

Textarea
	formatting: none (htmlspecialchars off)
	field type: CKEditor
	content type: markup/html
	experimental markup/html settings: all on
	acf: on
	html purifier: on
	additional purify settings: all on
	extra allowed content: none
	add-ons: pwimage, pwlink, sourcedialog
	sourcedialog settings: none
	disabled add-ons: image, magicline

 

Share this post


Link to post
Share on other sites

Hi @iipa, here are some useful links:

I use the hook from Ryan since a few weeks and it works really good:

ms-word-stripper-screenshot.thumb.jpg.ff55d959d6c4c03533cff350dec29ac5.jpg

Maybe one can this adapt to match more than inline styles. (?)

 

  • Thanks 1

Share this post


Link to post
Share on other sites

@iipa Pasting stuff from Word should be no problem ... I think most of our clients likely do this. So you definitely shouldn't get an internal server error from PW at least. Though you might try setting $config->debug=true; in your /site/config.php file (temporarily) and trying again, just in case — that will make it produce a verbose error message. But it's more likely that it is coming from mod_security or some other Apache or PHP module on the server that is monitoring input and halting the request when it comes across something it doesn't like. I don't blame it, as MS word can produce some pretty sketchy looking markup. But between CKEditor and htmlpurifier, PW should be able to clean it up just fine once the server lets it through. But if it's an Apache/PHP module doing this (which seems likely) then nothing you adjust in PW can fix it since the module examines the request before PW even boots, so you'd instead have to disable or configure the Apache module (mod_security, suhosin, or whatever it might be). 

  • Like 1
  • Thanks 1

Share this post


Link to post
Share on other sites
On 8/17/2019 at 3:16 PM, eydun said:

Hi iipa

Does the user use the "Paste from Word"-button?

image.thumb.png.548e13ec7a5ceb324db5198158ea7980.png

 

Oon some browser+OS combos that button does not work, unfortunately.

6 hours ago, ryan said:

@iipa Pasting stuff from Word should be no problem ... I think most of our clients likely do this. So you definitely shouldn't get an internal server error from PW at least. Though you might try setting $config->debug=true; in your /site/config.php file (temporarily) and trying again, just in case — that will make it produce a verbose error message. But it's more likely that it is coming from mod_security or some other Apache or PHP module on the server that is monitoring input and halting the request when it comes across something it doesn't like. I don't blame it, as MS word can produce some pretty sketchy looking markup. But between CKEditor and htmlpurifier, PW should be able to clean it up just fine once the server lets it through. But if it's an Apache/PHP module doing this (which seems likely) then nothing you adjust in PW can fix it since the module examines the request before PW even boots, so you'd instead have to disable or configure the Apache module (mod_security, suhosin, or whatever it might be). 

I actually have debug on, since we haven't launched yet, but I'm also starting to lean into problem being in server side configurations.

On 8/17/2019 at 7:39 PM, horst said:

I use the hook from Ryan since a few weeks and it works really good

Maybe one can this adapt to match more than inline styles. (?)

I managed to accomplish this in CKEditor's config.js by adding

config.disallowedContent = '*{*}';

which does it automatically when pasting. It was also enough to keep server happy aswell!

  • Like 1

Share this post


Link to post
Share on other sites

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.


  • Recently Browsing   0 members

    No registered users viewing this page.

  • Similar Content

    • By Guy Incognito
      Hi all,
      Having a strange problem with my CKeditor custom styles.
      Trying to add standard bootstrap classes to the image alignment options. But for some reason if I add multiple classes to the centred image option it disappears from the editor drop down. But with only one class it works.
      So this works:
      { name: 'Left Aligned Photo', element: 'img', attributes: { 'class': 'float-md-left img-fluid' } }, { name: 'Right Aligned Photo', element: 'img', attributes: { 'class': 'float-md-right img-fluid' } }, { name: 'Centered Photo', element: 'img', attributes: { 'class': 'img-fluid' } }, But this doesn't:
      { name: 'Left Aligned Photo', element: 'img', attributes: { 'class': 'float-md-left img-fluid' } }, { name: 'Right Aligned Photo', element: 'img', attributes: { 'class': 'float-md-right img-fluid' } }, { name: 'Centered Photo', element: 'img', attributes: { 'class': 'w-100 img-fluid' } }, Note the slight difference to the last line.
      I've tried escaping the hyphens in case it was that but doesn't help.
      Any ideas.... #puzzled!
    • By Gadgetto
      Hello,
      is it possible to configure CKEditor to have syntax highlighting enabled in Source and/or Sourcedialog? Coming from MODX i had this feature enabled and now I'm trying to find a solution for PW too.
      I'd like to have both the WysiWyg Editor and the Source editor with syntax highlighting enabled in on field.
      Andy plugins to achieve this?
      Greetings,
      Martin
    • By Smirftsch
      I've got some odd problem adding additional styles to mystyles.js. After reading some and following the article here: https://github.com/ryancramerdesign/ProcessWire/blob/dev/wire/modules/Inputfield/InputfieldCKEditor/README.md#custom-editor-js-styles-set
      I was able to add a custom style. So far so good 😉
      It is displayed in the Styles menu and can be selected there. Now the odd thing starts, if I add a custom style like this:
       
      ... { name: 'Box Top', element: 'span',attributes: { 'class': 'box_top' } }, { name: 'Box Bottom', element: 'span', attributes: { 'class': 'box_bottom' } }  
      it works and can be selected just fine.
      However, if I add this:
      { name: 'Box Top', element: 'div',attributes: { 'class': 'box_top' } }, { name: 'Box Bottom', element: 'div', attributes: { 'class': 'box_bottom' } } It is also displayed, can be selected - BUT - once the edit is saved, it is gone, won't be displayed in the page and won't be shown anymore in the editor as selected style, it goes back to "normal".
       
      Can anyone give me a hint what I missed?
    • By Guy Incognito
      I added some custom styles to the CKeditor menu bar using the example mystyles.js and the PW tutorial. This worked fine for fields when editing on the frontend. But none of our custom styles showed in the backend editor dropdown unless we edited the core copy of mystyles.js in wire/modules.
      Is this correct behaviour, a bug or a mistake on my part? Tried clearing cache, logging in/out etc but the backend ignores our custom styles in the site/modules path.
    • By Robin S
      A pet hate of mine is when an editor uses a paragraph of bold text for what ought to be a heading. When I need to tidy up poorly formatted content like this I will quickly change such lines of text into the heading of the appropriate level, but that still results in markup like...
      <h2><strong>Some heading text</strong></h2> The <strong> has no business being there, but it's a bit of a hassle to remove it because you have to drag a selection around the exact text as opposed to just placing your cursor within the line. That gets tedious if you have a lot content to process.
      I figured there has to be an easier way so started looking into the ACF (Advanced Content Filter) features of CKEditor. What I wanted is a rule that says "strong tags are disallowed specifically when they are within a heading tag". (I guess there could occasionally be a use case where it would be reasonable to have a strong tag within a heading tag, but it's so rare that I'm not bothered about it).
      With the typical string format for allowedContent and disallowedContent there is no ability to disallow a specific tag only when it is within another specific tag - a tag is allowed everywhere or not at all. But I found there is an alternative object format for these rules that supports a callback function in the "match" property. So I was able to achieve my goal with the following in /site/modules/InputfieldCKEditor/config.js:
      CKEDITOR.editorConfig = function(config) { config.disallowedContent = { // Rule for the <strong> element strong: { // Use "match" callback to determine if the element should be disallowed or not match: function(element) { // Heading tag names var headings = ['h1', 'h2', 'h3', 'h4', 'h5', 'h6']; // The parent of the element (if any) var parent = element.parent; if(typeof parent !== 'undefined') { // If there is a parent, return true if its name is in the heading names array return headings.indexOf(parent.name.toLowerCase()) !== -1; } else { // There is no parent so the element is allowed return false; } } } } };  
      Another tip: if you want to debug your allowedContent or disallowedContent rules to make sure they are being parsed and applied successfully you can log the filter rules to the console. For convenience I used /site/modules/InputfieldCKEditor/config.js again.
      // Get the CKEditor instance you want to debug var editor = CKEDITOR.instances.Inputfield_body; editor.on('instanceReady', function() { // Log the disallowed content rules console.log(editor.filter.disallowedContent); });  
×
×
  • Create New...