JoshoB

Stripping empty paragraphs

Greetings! I'm trying to develop a module that automatically strips empty paragraphs from a textarea (CKEditor) field. I don't think I did anything wrong, but the code below doesn't seem to have any effect. Can anyone tell me what I'm doing wrong?

class TextformatterStripEmptyParagraphs extends Textformatter 
{
	public static function getModuleInfo() 
	{
		return array(
			'title' => 'Strip empty paragraphs',
			'version' => 1,
			'summary' => "Removes empty paragraphs from a field, including <p></p> and <p>&nbsp;</p>. Apply after all other text formatters have been applied to a field."
		);
	}

	public function format(&$str) 
	{
		$pattern = "#<p[^>]*>(\s|&nbsp;|</?\s?br\s?/?>)*</?p>#";
		$str = preg_replace($pattern, '', $str);		
	}
} 

I figured this ought to work, but it doesn't. It's applied after all other Textformatters (Video Embed for YouTube by Ryan, my own Lightbox module, and SmartyPants). My Lightbox module works fine when preg_replacing parts of image tags, so this ought to work, too. But whenever someone has entered an empty paragraph (with a non-breaking space added by CKEditor), it just doesn't do anything.

Hi JoshoB

I think you need to escape your '/' characters. Could you try this and let us know if it works:

$pattern = '#<p[^>]*>(\s|&nbsp;|<\/?\s?br\s?\/?>)*<\/?p>#';
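For anyone comparing the two versions: with `#` as the delimiter, `/` has no special meaning in the pattern, so the escaped and unescaped forms should behave the same. Here is a quick check, assuming the alternative between `\s` and the `<br>` branch is the literal entity `&nbsp;`, and using made-up sample strings rather than real field content:

```php
<?php
// Both patterns use '#' as the delimiter, so '/' needs no escaping.
$unescaped = '#<p[^>]*>(\s|&nbsp;|</?\s?br\s?/?>)*</?p>#';
$escaped   = '#<p[^>]*>(\s|&nbsp;|<\/?\s?br\s?\/?>)*<\/?p>#';

// Hypothetical sample inputs, not taken from a real field.
$samples = array(
    'entity'   => '<p>Text</p><p>&nbsp;</p>',
    'br'       => '<p>Text</p><p><br /></p>',
    'utf8nbsp' => "<p>Text</p><p>\xc2\xa0</p>", // raw UTF-8 non-breaking space
);

// Run every sample through both patterns and keep the results side by side.
$results = array();
foreach ($samples as $name => $html) {
    $results[$name] = array(
        preg_replace($unescaped, '', $html),
        preg_replace($escaped, '', $html),
    );
}
```

In this sketch both patterns strip the entity and `<br />` paragraphs, and both leave the paragraph with the raw UTF-8 non-breaking space untouched.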

On mobile, but isn't there a setting for that already built in at the field?

Quote: "I think you need to escape your '/' characters. Could you try this and let us know if it works:"

I also hate empty paragraphs so gave this a try. Works well for me.

Quote: "On mobile, but isn't there a setting for that already built in at the field?"

This is an interesting one. There is a setting, but it doesn't work as intended. With the help of Tracy Debugger I did a bit of investigating as to why but haven't got to the bottom of it yet. The line intended to replace empty paragraphs in InputfieldCKEditor is this:

$value = str_replace(array('<p><br /></p>', '<p>&nbsp;</p>', '<p></p>', '<p> </p>'), '', $value);

But it doesn't match the empty paragraphs because $value has already passed through HTML Purifier, where &nbsp; gets replaced with some mystery space character. So neither '&nbsp;' nor ' ' matches the space character between the paragraph tags. I haven't been able to work out what this space character is because it's rendered like a normal space in the variable dump.
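One way to identify a character that looks like an ordinary space in a dump is to print its raw bytes as hex. A minimal sketch, assuming a hypothetical $value that stands in for the purified field content (bin2hex() is a standard PHP function):

```php
<?php
// Hypothetical value standing in for the purified field content.
$value = "<p>\xc2\xa0</p>";

// Grab whatever sits between the paragraph tags.
preg_match('#<p>(.*?)</p>#s', $value, $m);
$inner = $m[1];

$hex = bin2hex($inner);            // hex representation of the raw bytes
$isPlainSpace = ($inner === ' ');  // compare against an ASCII space (0x20)

echo $hex, PHP_EOL;
```

An ASCII space would show up as 20, so anything else in the hex output points to a different character.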

---

Update: the mystery character is a UTF-8 encoded non-breaking space character. So the code above should instead be:

$value = str_replace(array('<p><br /></p>', '<p>&nbsp;</p>', "<p>\xc2\xa0</p>", '<p></p>', '<p> </p>'), '', $value);

Double quotes are needed around the string with the UTF-8 non-breaking space. I'll submit a pull request for this fix.
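The quoting matters because PHP only expands \x escape sequences inside double-quoted strings. A small sketch of the difference, using a made-up $value:

```php
<?php
// In double quotes PHP expands \xc2\xa0 into two bytes (one UTF-8 NBSP).
$double = "<p>\xc2\xa0</p>";
// In single quotes the backslash sequences stay as literal text.
$single = '<p>\xc2\xa0</p>';

// Hypothetical field value with an empty paragraph holding a UTF-8 NBSP.
$value = "<p>Hello</p><p>\xc2\xa0</p>";

$withDouble = str_replace($double, '', $value); // empty paragraph is removed
$withSingle = str_replace($single, '', $value); // nothing matches, unchanged

echo strlen($double), ' ', strlen($single), PHP_EOL;
```

The two search strings have different lengths (9 and 15 bytes), which is why only the double-quoted one can match the purified markup.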

:empty works only if there is literally nothing between the tags. I don't know if this is the case here, but in most cases I can't use it for such a purpose.

Thanks for all the suggestions everyone!

Wow, that was the thing that was tripping me up: the UTF-8 encoded non-breaking space. Thanks so much! Works perfectly now. 
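For anyone landing here later, the fix can be sketched as a standalone function. This is only an illustration under assumptions: stripEmptyParagraphs() is a made-up name, not part of ProcessWire, and the pattern is a variation on the one from the first post with the UTF-8 non-breaking space bytes added and the closing tag tightened to </p>.

```php
<?php
/**
 * Hypothetical helper: removes paragraphs that contain only whitespace,
 * &nbsp; entities, UTF-8 non-breaking spaces, or <br> tags.
 */
function stripEmptyParagraphs(string $str): string
{
    // \xC2\xA0 is the UTF-8 byte sequence for a non-breaking space.
    // No /u modifier is used, so the pattern is matched byte by byte.
    // (?:\s[^>]*)? allows attributes on <p> without also matching <pre>.
    $pattern = '#<p(?:\s[^>]*)?>(?:\s|&nbsp;|\xC2\xA0|<br\s*/?>)*</p>#';

    // preg_replace() returns null on a PCRE error; fall back to the input.
    return preg_replace($pattern, '', $str) ?? $str;
}
```

Inside a Textformatter's format(&$str) method this would be called as $str = stripEmptyParagraphs($str);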

Quote: "Has anyone tried p:empty{ display:none; }?"

Quote: ":empty works only if there is literally nothing between the tags. I don't know if this is the case here, but in most cases I can't use it for such a purpose."

Besides, it's more like a hack than a solution, anyway ;)

This one caught me today 🙂 Thx to @netcarver and @Robin S I came up with a working solution:

$html = $page->getUnformatted('rr_cke');
$html = preg_replace('!<p[^>]*>([\xc2\xa0]*|\s|&nbsp;|<\/?\s?br[^>]*\/?>)*<\/?p>!', "", $html); // remove empty paragraphs
$html = $sanitizer->markupToText($html); // remove all tags
$html = preg_replace('![\r\n]+!', "\n", $html); // replace multiple newlines
db($html);
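Outside ProcessWire, the same three steps can be tried with plain PHP. A rough sketch, assuming strip_tags() as a crude stand-in for $sanitizer->markupToText() (which does more than drop tags), a slightly simplified pattern, and a made-up input string:

```php
<?php
// Made-up markup standing in for $page->getUnformatted('rr_cke').
$html = "<p>First</p>\n<p>\xc2\xa0</p>\n<p><br /></p>\n<p>Second</p>";

// 1) Remove empty paragraphs (whitespace, &nbsp;, UTF-8 NBSP bytes, <br>).
$html = preg_replace('!<p[^>]*>(?:\xc2\xa0|\s|&nbsp;|<br[^>]*>)*</p>!', '', $html);

// 2) Drop the remaining tags. strip_tags() is only a stand-in here.
$text = strip_tags($html);

// 3) Collapse runs of newlines into a single newline.
$text = preg_replace('![\r\n]+!', "\n", $text);

echo $text, PHP_EOL;
```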

Here you go:

 

Not sure if it should also strip &nbsp; ?
