
Converting HTML entity characters


FrancisChung

Hi there,

Bit of a rookie question. It appears that one of my users tried to store some markup in a Textarea field for some SEO text, but it was stored as HTML entity characters (&lt; &gt;).

I checked the field type details (TextArea, No Formatters, Content-Type:Unknown) and tried a few things (TinyMCE, Markup/HTML Type) to no avail.

I was wondering if there's a quick fix for this, as I'm almost resigned to writing an on-the-fly formatter to fix it.

I'm hoping someone can point me to a more CMS-centric solution?


Ideally, I'd like to store them as actual HTML tags instead of the HTML-entity-encoded characters they currently are.

Perhaps it's stored that way as some sort of security measure?

If so, is there a built-in function or something in the backend to read them as decoded HTML characters so the text gets formatted by the browser?


Have you actually been able to reproduce the problem?

I'm asking this because it's not clear to me if it was the user who posted the data like that, or if that's how the data gets stored in the database, or if you mean that when you try to print the value of the field in your program, it comes out as &lt; instead of < on your screen. It's not even clear if the user who posted the data used a custom form or the PW admin.

To answer your question, there is of course html_entity_decode() (http://php.net/manual/en/function.html-entity-decode.php), which can also be used through PW like this:

$sanitizer->unentities($text);

However I don't think it's the right approach - to me it sounds like you have a problem elsewhere. Maybe you are encoding the value twice. Could you please post us some bits of code which demonstrates your problem? I'm quite sure we can then easily help you fix your problem. It would also help if you could paste the raw value of the field directly from the database.
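To illustrate both the decoding and the "encoded twice" idea, here is a minimal sketch in plain PHP. The sample string is made up for illustration and is not the actual field value from the database; it only shows that a singly encoded value decodes in one pass, while a doubly encoded one needs two.

```php
<?php
// Hypothetical stored value, modeled on the entity-encoded markup described in this thread.
$stored = '&lt;h2&gt;Title&lt;/h2&gt;';

// One decoding pass turns the entities back into real tags.
$decoded = html_entity_decode($stored, ENT_QUOTES | ENT_HTML5, 'UTF-8');

// Encoding an already-encoded value escapes the ampersands again (double encoding).
$double = htmlspecialchars($stored, ENT_QUOTES, 'UTF-8');

// A double-encoded value is still entity text after one pass and needs a second one.
$once  = html_entity_decode($double, ENT_QUOTES | ENT_HTML5, 'UTF-8');
$twice = html_entity_decode($once, ENT_QUOTES | ENT_HTML5, 'UTF-8');

echo $decoded, PHP_EOL; // real tags
echo $double, PHP_EOL;  // ampersands escaped a second time
```

If the raw database value looks like `$double` rather than `$stored`, something in the pipeline is most likely encoding the text twice.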


The user was copying and pasting the contents of an MS Word document into the CMS via the ProcessWire admin.

The field in question is an SEO Text field, which is set up as a Textarea (TinyMCE).

Here is a sample of how the data looks when viewed in the ProcessWire admin.

I believe this is the same format as the original MS Word document.

<H2>

„Wer kriecht aus dem Schneckenhaus?“ und viele weitere Sprachspiele für Dich

</H2>

 

I believe this is stored in the DB raw as:

<p>&lt;H2&gt;</p>

<p>„Wer kriecht aus dem Schneckenhaus?“ und viele weitere Sprachspiele für Dich</p>

<p>&lt;/H2&gt;</p>

Perhaps the field type is wrong and there are more appropriate types?

I agree completely that I shouldn't have to do any postprocessing on the fields, which is why I came to the forum and asked for some guidance.

Thanks for the info about decoding HTML characters. Not sure why my Google-Fu was not up to scratch that day.


@FrancisChung, perhaps I'm misunderstanding this, but I think you are saying they are pasting markup into an RTE? That could never work under any circumstances, because RTE means rich text, not markup. If they want to paste in markup, they would have to first click the code view button.
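As a rough illustration of why that happens, here is a small PHP sketch of what a rich text editor effectively does with text typed in its visual mode: each line becomes a paragraph and any angle brackets are escaped so they display literally. The `rich_text_store()` helper is hypothetical and is not TinyMCE's or ProcessWire's actual code.

```php
<?php
// Hypothetical helper: mimics how visual-mode input ends up as escaped paragraphs.
function rich_text_store(string $typed): string
{
    // Treat each non-empty line as its own paragraph.
    $paragraphs = preg_split('/\R+/', trim($typed));
    $html = '';
    foreach ($paragraphs as $p) {
        // Typed "<" and ">" are content, not markup, so they get escaped.
        $html .= '<p>' . htmlspecialchars($p, ENT_NOQUOTES, 'UTF-8') . '</p>';
    }
    return $html;
}

// Markup pasted as visible text, as in the example from this thread.
$typed = "<H2>\nHeading text\n</H2>";
$storedHtml = rich_text_store($typed);

echo $storedHtml, PHP_EOL;
```

The result has the same shape as the raw value quoted earlier in the thread: the pasted tags survive only as escaped text inside paragraphs.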


Hi @Macrura, so what you're saying is that TinyMCE is a rich text editor field and not a HTML field editor (by default)? So a user would need to press the HTML button first before entering the markup?

I assume the code view button you mentioned is the HTML button on the TinyMCE?


I may be misunderstanding, but why do you even use TinyMCE, which is by definition a tool for creating HTML content? SEO fields do not support any HTML tags, only raw text. Just use a simple textarea without any rich text gimmicks, maybe enable the strip tags setting, and you're good to go.
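For reference, this is roughly what stripping tags does to a value, shown with PHP's own strip_tags() on a made-up sample string (how the field setting is implemented internally is not covered here):

```php
<?php
// Hypothetical input containing markup that a plain-text SEO field should not keep.
$input = '<h2>Heading</h2> and <strong>bold</strong> text';

// strip_tags() removes the tags but keeps the text between them.
$plain = strip_tags($input);

echo $plain, PHP_EOL;
```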

Hi @Macrura, so what you're saying is that TinyMCE is a rich text editor field and not a HTML field editor (by default)? So a user would need to press the HTML button first before entering the markup?

I assume the code view button you mentioned is the HTML button on the TinyMCE?

That's right. 


The text supplied by an SEO consultant came with HTML markup, so the users wanted to put it in as is, with the markup preserved.

So would a plain textarea suffice? Would that show the HTML markup, assuming I don't enable the strip tags setting?

FYI, the SEO field was previously plain text without markup, but that recently changed when they hired an SEO consultant.
