CKEditor and pasting


AAD Web Team

Hi,

I've run into a bit of an issue with CKEditor on the pilot ProcessWire site we're currently working on. Ultimately it may prove to be something I need to ask about on the CKEditor pages at GitHub, but I thought I'd try here first in case I'm missing something...

The site in question is running PW 3.0.98.

In the Body field's settings we have ACF and HTML Purifier turned on. I've told it to retain non-breaking spaces but it's allowed to strip out empty paragraphs and change divs to paragraphs.

I've been happily customising the editor, adding and removing buttons, using mystyles.js and config.js to get it to behave exactly how I want it to, but now I've hit a snag...

Problem 1: When I click on the 'Paste from Word' button I get a message telling me that the browser doesn't allow it and I should use Ctrl+V (from the little I've read online this is a known issue in CKEditor and is due to browser behaviour). This wouldn't be so bad if it weren't for Problem 2...

Problem 2: When I use normal paste (Ctrl+V) and the content has come from Word, it adds in a whole bunch of inline style attributes! Every h1, h2, p etc has style="margin-left:0cm; margin-right:0cm". Yet if I try to type styles such as these, they are stripped out. I can't understand why paste is behaving differently. Is this the expected behaviour? Even after saving/publishing the page, those stubborn pasted styles remain. How does it even know the difference between what I've pasted in and what I've typed in, once the page has been saved?

I've looked into the 'pasteFilter' event in the CKEditor documentation and this seems like it could possibly allow a workaround, if I can explicitly allow certain tags during pasting, but I can't figure out if and where I can customise this in ProcessWire. I tried adding the following to config.js as a test, but it didn't make any difference:

config.pasteFilter = 'p; a[!href]';

I also tried pasting this code (taken from the CKEditor documentation) into mystyles.js (a long shot, I know), but that didn't work either. My thinking was that I could use this method to strip out those unwanted styles:

editor.on( 'paste', function( evt ) {
    evt.data.dataValue = evt.data.dataValue
        .replace( /The/gi, 'AHH' )
} );
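
For reference, here is a minimal sketch of what such a paste handler might do to the pasted HTML in place of the placeholder replacement above. `stripInlineStyles` is a made-up helper name, not a CKEditor function, and the wiring in the comment assumes `editor` is an existing CKEditor 4 instance, which is not the case inside mystyles.js:

```javascript
// Hypothetical helper (not part of CKEditor): removes every style="..." or
// style='...' attribute, together with the whitespace in front of it.
function stripInlineStyles(html) {
  // (["']) captures the opening quote, [\s\S]*? lazily consumes the value,
  // and \1 requires the same quote character to close it.
  return html.replace(/\s+style\s*=\s*(["'])[\s\S]*?\1/gi, '');
}

// Possible wiring, assuming `editor` is a CKEditor 4 instance:
// editor.on('paste', function (evt) {
//   evt.data.dataValue = stripInlineStyles(evt.data.dataValue);
// });

const pasted = '<h2 style="margin-left:0cm; margin-right:0cm">Title</h2>' +
               "<p style='margin-left:0cm'>Text</p>";
console.log(stripInlineStyles(pasted)); // <h2>Title</h2><p>Text</p>
```

Because the pattern requires whitespace directly before `style`, attributes such as `data-style` are left alone.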

I'd be curious to know if anyone else here has encountered a similar issue and knows of a workaround/fix!

I should also add that I've observed the same behaviour in Firefox, Chrome and IE, and on a separate installation of PW running 3.0.101.

Thanks,

Margaret
(Web Team, Australian Antarctic Division)


I've run into this problem on a client site before. I was never able to duplicate it myself, but noticed there were style attributes ending up in the CKE field of my client's site somehow, and the client said they pasted from Word. I don't have Word, and I'm not observing the issue occurring with pasting from anything else, so there must be some Microsoft trickery going on, who knows.  I don't understand why CKE's ACF allows it. It does seem like a CKE bug. I'm sure it's possible to configure HTML Purifier to remove that stuff, but haven't looked too closely into it.

While there is a way to force CKE to paste as plain text (see thread that flydev linked), I don't really like losing links and normal CKE formatting when I paste, because I'm often pasting from one CKE field to another, and having to redo all the formatting is a pain. Usually if I want to paste as plain text, I just use SHIFT-CMD-V on Mac (SHIFT-CTRL-V on PC), as this is a pretty universal way to paste as plain text, anywhere. But the reality is, clients aren't likely to remember it, and many are doing Edit > Paste, rather than using the keyboard. I'd rather not have the client have to think about whether they need to use plain text pasting or not. 

A simple way to remove style attributes from CKE fields is to hook after the field's processInput method, look for them in the submitted value, and remove them. This hook at the top of your /site/templates/admin.php file should do the trick:

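$wire->addHookAfter('InputfieldCKEditor::processInput', function($event) {
  $inputfield = $event->object;
  $value = $inputfield->attr('value');
  // Nothing to do if the submitted markup contains no style attributes
  if(strpos($value, 'style=') === false) return;
  $qty = 0;
  // Remove style="..." and style='...'. Inside a character class \1 is not a
  // backreference (PCRE reads it as the character \x01), so [^\1]+? simply means
  // "any characters, lazily"; the lazy +? stops at the first closing quote.
  $value = preg_replace('/\bstyle=(["\'])([^\1]+?)\1/i', '', $value, -1, $qty);
  if(!$qty) return;
  // Store the cleaned value and show a notice in the admin
  $inputfield->attr('value', $value);
  $inputfield->trackChange('value');
  $inputfield->warning("Stripped $qty style attribute(s) from field $inputfield->name");
});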

Word may be adding some other weirdness too, but I don't have a good example to look at to know what else there might be (MSO-specific class names?). In any case, we could probably remove it in a similar manner. 

Edited by horst
added one ? to make it ungreedy, like AAD Web Team posted

Thanks @flydev and @dragan for your suggestions. I forgot to mention in my original post that I did try paste as plain text. While it works well and is a great option in some situations, it's probably a bit of a last resort for us in this instance. As @ryan has pointed out, we don't want our editors to have to remember to do it (or force them to do it), and they'll just get irritated that all their formatting disappears. They don't always paste from Word, but it happens enough for us to have to make it as simple as possible for them (but not too painful for us as the ones who then have to check their pages prior to publishing them).

16 hours ago, dragan said:

wow, PW is really going places

Indeed! I've been using PW for personal sites for a few years now, so when it became evident we needed to change our CMS I introduced it to the rest of the Web Team and then we managed to argue the case for it to be our new CMS!

Finally thanks to @ryan for your fantastic advice. I might look into HTML Purifier a little, but that hook code should be enough to get it operating how we need it. I'll give it a try when I'm back at work on Tuesday!


https://www.google.ch/search?q=ckeditor+bug+copy+and+paste+from+word

It seems like there are many others encountering that same bug.

Someone here, for example, suggests using only "basic HTML" instead of "full HTML": https://www.drupal.org/project/drupal/issues/2940054

https://github.com/ckeditor/ckeditor-dev/issues/595

^ seems like pasting on mobile browsers is/was also an issue...


Quote

Someone here, for example, suggests using only "basic HTML" instead of "full HTML"

I think they are referring to a Drupal setting that maps to CKEditor's config.fullPage, for editing a full HTML document? That's not something we use. Though it's possible I'm missing something. 

I noticed CKEditor 4 is now up to version 4.9.2 (we are on 4.8.0). Maybe they have fixed the issue in CKE. I'm going to update to 4.9.2 in PW core 3.0.107 this week, hopefully that'll help. 

 


Thanks again for the hook code @ryan - it works perfectly, with just one minor change - we added a question mark to stop the regular expression being greedy. It was replacing everything between the opening of the first style on the page and the closing of the last style of the page.

$value = preg_replace('/\bstyle=(["\'])([^\1]+?)\1/i', '', $value, -1, $qty);
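
To illustrate the difference the question mark makes, here is a small standalone JavaScript sketch (not the PHP hook itself; the sample markup is made up, and the character class is simplified to a plain `.` for the demonstration):

```javascript
// Two paragraphs, each with its own Word-style inline style attribute.
const html = '<p style="margin-left:0cm">One</p><p style="margin-right:0cm">Two</p>';

// Greedy: .+ runs on to the LAST matching quote, swallowing the first
// paragraph's text and the second opening tag along the way.
const greedy = html.replace(/\bstyle=(["'])(.+)\1/gi, '');

// Lazy: .+? stops at the FIRST matching quote, so each style attribute
// is removed on its own.
const lazy = html.replace(/\bstyle=(["'])(.+?)\1/gi, '');

console.log(greedy); // <p >Two</p>
console.log(lazy);   // <p >One</p><p >Two</p>
```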

It's a great workaround until they fix the bug in CKEditor. I'll keep an eye out for the next core update.


I use the following in /site/modules/InputfieldCKEditor/config.js to disallow inline styles.

CKEDITOR.editorConfig = function( config ) {
    config.disallowedContent = '*{*}'; // All styles disallowed
};

 


  • 3 weeks later...

@cjx2240

  1. download and install the plugin in the site/modules/InputfieldCKEditor/plugins directory
  2. in the field settings, on the Input tab, section plugins, check pastefromword
  3. in the CKEditor settings > CKEditor Toolbar, add PasteFromWord
  4. save your field

  • 10 months later...
On 6/27/2018 at 2:23 AM, Robin S said:

I use the following in /site/modules/InputfieldCKEditor/config.js to disallow inline styles.


CKEDITOR.editorConfig = function( config ) {
    config.disallowedContent = '*{*}'; // All styles disallowed
};

Thank you @Robin S for this one!

Do you know the syntax for "disallow all styles except text-align:left|center|right|justify"? The docs only show how to allow everything except: https://ckeditor.com/docs/ckeditor4/latest/guide/dev_disallowed_content.html#how-to-allow-everything-except

Or in other words: How can I disallow all styles so that they are removed when pasted from Word (which works great with your custom config rule), but still allow some formatting, like the text-align features?


47 minutes ago, bernhard said:

How can I disallow all styles so that they are removed when pasted from Word (which works great with your custom config rule), but still allow some formatting, like the text-align features?

For this I think you would not disallow styles in config.disallowedContent but would instead insert the following in the Extra Allowed Content of your CKEditor fields:

*{text-align}
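
As a rough illustration of the intended effect of that rule (keep text-align, drop every other inline style), here is a standalone JavaScript sketch. It is not how CKEditor's content filter is implemented, and `filterStyles` is a hypothetical helper name:

```javascript
// Hypothetical helper: keeps only the whitelisted CSS properties inside each
// style attribute and removes the attribute entirely if nothing is left.
function filterStyles(html, allowed) {
  return html.replace(/\s+style\s*=\s*(["'])([\s\S]*?)\1/gi, (match, quote, css) => {
    const kept = css
      .split(';')                      // one entry per declaration
      .map((decl) => decl.trim())
      .filter((decl) => {
        const prop = decl.split(':')[0].trim().toLowerCase();
        return decl.includes(':') && allowed.includes(prop);
      });
    return kept.length ? ` style=${quote}${kept.join('; ')}${quote}` : '';
  });
}

const input = '<p style="margin-left:0cm; text-align:center">Hi</p>' +
              '<h2 style="margin-right:0cm">T</h2>';
console.log(filterStyles(input, ['text-align']));
// <p style="text-align:center">Hi</p><h2>T</h2>
```

Splitting on semicolons is a simplification: a value that itself contains a semicolon (a data: URL, for instance) would be cut in the wrong place.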

 


Ever since the days of forced p tags, CKEditor has kept haunting us: you never get its settings quite the way a client really wants them.
TinyMCE isn't the answer either. A client doesn't need hundreds of settings anyway. Just a simple editor would do fine.

 


1 hour ago, bernhard said:

Thx, but this only allows P tags and strips all H, UL, etc.

Maybe you have some other setting that's interfering - something in /site/modules/InputfieldCKEditor/config.js? 

It's working for me...


Hi 

I am using CKEditor 4.0. When I copy terms and conditions that were written in a Word document, the result is not formatted as it was in the Word file.

Can anyone suggest which parameters I need to set in the config file to keep the Word document's formatting?

 

Regards

Himanshu


Hi @Himanshu Modi and welcome to the forum,

there is a reason why CKEditor usually does NOT reproduce all of Word's formatting... The HTML that copy-pasting from Word usually produces is a nightmare. You end up with all kinds of styling attributes in your code, which has several disadvantages (SEO, inconsistent styling across your site such as sizes, fonts and colors, etc.).


  • 1 month later...
On 6/17/2019 at 9:27 PM, Himanshu Modi said:

When I copy terms and conditions that were written in a Word document, the result is not formatted as it was in the Word file.

Using CTRL + SHIFT + V instead of just CTRL + V (on Windows), you can still paste the content and CKEditor will recognize basic stylings such as line-breaks and bold/italic, lists etc. without all the extra MS-Word proprietary stuff that @bernhard mentioned. That is intentional, since MS-Office documents and your website don't use the same styling, and you really don't want to end up with strange-looking parts in your website because of such author actions.


  • 5 months later...

During a recent maintenance routine we found that our website's database (1,700+ pages) had thousands of instances of unnecessary, garbage code that had come with copied text from Word. Passages with margins expressed in points, cms and inches, and some that were wrapped in upwards of 7 spans were among the most easily identified crimes. Purging all of this dropped our database size by over 4%.

A few of the code examples above nuke all inline styles, which will break some important out-of-the-box functionality in PW3 and CKEditor (depending on your use), specifically many of the table and list options, such as setting a column width or changing the bullet style within a nested list.

To work around that, I made some changes to Ryan's code to target specific tags and to eliminate spans (which you can only add via Source view without pasting them in).

$wire->addHookAfter('InputfieldCKEditor::processInput', function($event) {
  $inputfield = $event->object;
  $value = $inputfield->attr('value');
  if ((strpos($value, 'style=') === false) && (strpos($value, '<span>') === false)) return;
  // Note: the check above only detects attribute-less spans, so a value whose
  // only spans carry attributes (and has no style=) is left untouched.
  $count = 0;
  $qty = 0;

  // Optional: remove all span tags (opening and closing), keeping their contents
  $value = preg_replace('/<span\b[^>]*>/i', '', $value, -1, $qty);
  $value = preg_replace('/<\/span\s*>/i', '', $value, -1);
  $count = $count + $qty;

  // Remove inline styles from the specified tags only
  $tags = array('p','h2','h3','h4','li');
  foreach ($tags as $tag){
    // \b stops 'p' from also matching <pre> and 'li' from matching <link>
    $value = preg_replace('/(<'.$tag.'\b[^>]*) style=("[^"]+"|\'[^\']+\')([^>]*>)/i', '$1$3', $value, -1, $qty);
    $count = $count + $qty;
  }

  if(!$count) return;
  $inputfield->attr('value', $value);
  $inputfield->trackChange('value');
  // $count covers removed spans as well as removed style attributes
  $inputfield->warning("Stripped $count span(s) and style attribute(s) from field $inputfield->name");
});

 

Edited by Arcturus
Fixed if statement on line 4

  • 2 years later...
On 1/27/2020 at 10:23 PM, Arcturus said:

To work around that, I made some changes to Ryan's code to target specific tags and to eliminate spans (which you can only add via Source view without pasting them in).


This is great, thanks!

In my settings I am allowing spans with classes. Do you know how I might adapt your code to allow this? If I have HTML with <span class="foo">bar</span> but no empty spans or style attributes, <span class="foo">bar</span> stays. But if I add in <span>foo</span>, <span>foo</span> changes to just foo (great!) but <span class="foo">bar</span> also changes to bar.

Also, by default, CKEditor seems to allow style="margin: ...", which I assume is related to indentation. Any idea how to disable this? Extraneous spans are annoying but harmless, whereas the margin styles mess up the formatting. I do not even have the indentation buttons visible in the toolbar.

Thanks!
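
One possible approach to the span question, sketched in standalone JavaScript rather than as a tested ProcessWire hook: the hook above strips every span because its pattern matches any opening span tag, with or without attributes. Keeping classed spans means remembering, for each opening tag, whether it was kept, so that the matching closing tag gets the same treatment. `unwrapBareSpans` is a hypothetical name, and the same idea could presumably be ported to PHP with preg_replace_callback:

```javascript
// Hypothetical helper: unwraps <span> elements that carry no attributes and
// keeps spans that have any (for example class="foo").
function unwrapBareSpans(html) {
  const keepStack = []; // one entry per currently open span: true = keep its tags
  return html.replace(/<span\b([^>]*)>|<\/span\s*>/gi, (tag, attrs) => {
    if (attrs !== undefined) {
      // Opening tag: keep it only if it has at least one attribute.
      const keep = attrs.trim() !== '';
      keepStack.push(keep);
      return keep ? tag : '';
    }
    // Closing tag: follows whatever was decided for its opening tag.
    // A stray closing tag with no matching opener is dropped.
    const keep = keepStack.pop();
    return keep ? tag : '';
  });
}

console.log(unwrapBareSpans('<p><span class="foo">bar</span> <span>foo</span></p>'));
// <p><span class="foo">bar</span> foo</p>
```

The stack is what lets nested spans work: each closing tag is paired with the most recently opened span instead of being removed unconditionally.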
