Can't Get Rid of <p> </p> from $page->body


prestoav

Hi all,

This is driving me a little nuts so I hope someone has an idea what I'm doing wrong!

I have a marketplace site built on PW where users upload their own ads including a text description ($page->body) that is a CKEditor Textarea field. I may change this to a straight Textarea field in future but for now CKEditor is needed.

As expected many users paste in from Word and, because of this, we end up with a lot of <p> elements with just an &nbsp; inside. I'd like to remove these (and empty <p> elements too) for formatting on the front end.

I already use the 'Remove empty paragraph tags' option on the field, but that won't remove paragraphs containing an &nbsp;. So, I've added this to the template to do that job:
 

$bodyCopy = $page->body;
$bodyCopy = str_replace( "<p>&nbsp;</p>", "", $bodyCopy);
echo $bodyCopy;

This outputs the contents of the body field fine but refuses to get rid of the <p>&nbsp;</p>'s. In addition, since the source within View Source shows them all as "<p> </p>" i.e. with a rendered space character, I have tried replacing that string too but to no avail. They show as "<p>&nbsp;</p>" in the source editor in admin for that page > field by the way.

To investigate further I tried setting the initial variable in the template as actual text like this:

$bodyCopy = "<p>Real Text</p><p>&nbsp;</p><p>More real text.</p>";
$bodyCopy = str_replace( "<p>&nbsp;</p>", "", $bodyCopy);
echo $bodyCopy;

And this works just fine. So it looks like something happens when I assign the field output to the bodyCopy variable that stops str_replace from working.

Any ideas gratefully received!!!


For CK Editor fields, I always use the option to force pasting text as plain text. This strips any markup and formatting when pasting, and it has prevented so many headaches for me. It does mean that you can't paste formatted text even if you know what you're doing, but this way no weird formatting from Word can make it into the source code. You can activate this option through the Custom Config Options for the CK Editor field:

forcePasteAsPlainText:  true

I'm not sure that will remove the extra non-breaking space (the paragraphs get added by CK Editor, so you only need to get rid of the NBSP), but it's worth a try and it does deal with most problems when pasting from Word.


Hi MoritzLost,

Firstly, thanks for taking the time to respond. I'll give that a try and see what happens for new adverts.

Sadly I have over 400 adverts already in place, many with this issue, so I'll still need to find a template edit that solves the problem for adverts already in place.


Ok, in that case you can still filter out the unnecessary paragraph with the non-breaking space inside. Your str_replace function probably doesn't work because you are trying to replace the literal string &nbsp;. That won't work because the non-breaking space is (most likely) stored inside the database as a Unicode character, not as the HTML entity. Matching "<p> </p>" won't work either, because that is a regular space, not a non-breaking space.
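If you want to check this before changing anything, here is a minimal sketch. It assumes the field value behaves like the output of html_entity_decode(), which is only a stand-in for $page->body here; on the real site you would inspect the actual field value instead.

```php
<?php
// Stand-in for $page->body: decoding turns &nbsp; into the raw U+00A0 character.
$stored = html_entity_decode("<p>Real text</p><p>&nbsp;</p>", ENT_QUOTES | ENT_HTML5, 'UTF-8');

// The literal entity is no longer in the string, so str_replace() on "<p>&nbsp;</p>" finds nothing.
$hasEntity = strpos($stored, '&nbsp;') !== false;

// In UTF-8, U+00A0 is encoded as the two bytes C2 A0.
$hasNbspChar = strpos($stored, "\xC2\xA0") !== false;

// A hex dump of the problem paragraph makes those bytes visible.
$hex = bin2hex("<p>\xC2\xA0</p>");

var_dump($hasEntity, $hasNbspChar);
echo $hex, PHP_EOL;
```

If the hex dump of a real body value shows c2a0 between the paragraph tags, the character is stored as Unicode and not as the entity.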

To replace the non-breaking space, you can use a regular expression and match the specific NO-BREAK SPACE Unicode character. This works for me:

$string = "<p>&nbsp;</p>";
$string_unicode = html_entity_decode($string);
// $string_unicode now contains a non-breaking space unicode character

echo preg_replace('/<p>\x{00A0}<\/p>/u', 'Successfully replaced!', $string_unicode);
// Successfully replaced!

Make sure to include the u flag to treat the string as UTF-8. You could also modify this to match multiple non-breaking spaces or whatever CK Editor throws at you.
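As a rough sketch of such a modification, the pattern below also catches empty paragraphs, repeated non-breaking spaces, the literal &nbsp; entity, and paragraphs carrying attributes. The function name stripBlankParagraphs is made up for this example, and the attribute handling is an assumption about what Word pastes tend to leave behind.

```php
<?php
/**
 * Hypothetical helper: removes paragraphs that contain nothing but
 * whitespace, non-breaking spaces (raw U+00A0 or the &nbsp; entity),
 * or no content at all.
 */
function stripBlankParagraphs(string $html): string
{
    // <p\b[^>]*> accepts optional attributes but does not match <pre>.
    // The group lists every character sequence treated as "blank".
    $pattern = '/<p\b[^>]*>(?:\s|\x{00A0}|&nbsp;)*<\/p>/iu';

    $result = preg_replace($pattern, '', $html);

    // preg_replace() returns null on failure (e.g. invalid UTF-8 input);
    // in that case hand back the original markup untouched.
    return $result ?? $html;
}

$dirty = "<p>Real Text</p><p>\u{00A0}</p><p></p><p>More real text.</p>";
echo stripBlankParagraphs($dirty), PHP_EOL;
```

In a template this would be used as echo stripBlankParagraphs($page->body);. Paragraphs with real text around a non-breaking space are left alone.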

You can use that either to remove the superfluous paragraph in your template output, or write a little script to convert the text saved in the database (which is the cleaner way to go, if you ask me).
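For the database route, it may be worth doing a dry run first to see which adverts would actually change. The sketch below works on a plain array of id => body so it runs on its own; the function name planBodyCleanup and the sample IDs are invented for illustration, and loading and saving the real pages is left out.

```php
<?php
/**
 * Hypothetical dry-run helper: given [id => body HTML], returns only the
 * entries that would change, mapped to their cleaned HTML.
 */
function planBodyCleanup(array $bodies): array
{
    $changes = [];
    foreach ($bodies as $id => $html) {
        // Remove paragraphs that are empty or hold only non-breaking spaces.
        $clean = preg_replace('/<p>(?:\x{00A0}|&nbsp;)*<\/p>/u', '', $html);

        // Skip entries where the regex failed (null) or nothing changed.
        if ($clean !== null && $clean !== $html) {
            $changes[$id] = $clean;
        }
    }
    return $changes;
}

// Sample data standing in for bodies loaded from the database (IDs are made up).
$sample = [
    1021 => "<p>For sale</p><p>\u{00A0}</p><p>Call me</p>",
    1022 => "<p>Nothing to clean here</p>",
];
print_r(planBodyCleanup($sample));
```

In ProcessWire itself, the same loop would presumably run over the adverts returned by $pages->find(), with output formatting turned off via $page->of(false) before assigning the cleaned value and saving the body field. Take a database backup before writing anything.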


Thank you!

That worked perfectly. I knew I was missing something!
Just for completeness here's my resulting code for now:

// Prepare and output body text (cleans <p> tags with &nbsp; inside)
$bodyCopy = preg_replace('/<p>\x{00A0}<\/p>/u', '', $page->body);
echo $bodyCopy;

 
