encoded comma in URL field


This is a strange issue: a URL contains an encoded comma (%2c), and when you paste it into a URL field (a standard field, or a URL field within a ProFields Table), the %2c is converted to a literal comma. The URL then no longer works, so you can't use something like the AOS URL checker, and the link is broken on output. I had to do a string replace of the comma back to %2c to make the links work on the front end, but that seems like a hack. I'm wondering what causes this and whether there is a standard way to handle it.

The URL in question is this:

https://www.carlfischer.com/o4361-masters+of+our+day%2c+volume+i.html
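
The string-replace workaround described above could look roughly like this. It is a Python sketch for illustration only, not ProcessWire code, and the function name `reencode_commas` is hypothetical. It assumes every comma in the path was originally encoded, which cannot be known from the stored value alone, so it is a lossy repair rather than a real fix.

```python
from urllib.parse import urlsplit, urlunsplit


def reencode_commas(url: str) -> str:
    """Re-encode literal commas in the path of a URL as %2c.

    Only the path is touched; the query string and fragment are left as-is.
    """
    parts = urlsplit(url)
    path = parts.path.replace(",", "%2c")
    return urlunsplit(parts._replace(path=path))


# The value as it ends up after the field has decoded the comma.
stored = "https://www.carlfischer.com/o4361-masters+of+our+day,+volume+i.html"
print(reencode_commas(stored))
```

Limiting the replacement to the path is a design choice for this sketch: commas in a query string are often meaningful as literal separators and are left alone.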


The issue occurs because both InputfieldURL and FieldtypeURL put the value through $sanitizer->url(), and by default the "convertEncoded" option is true.

Quote

convertEncoded (boolean): Convert most encoded hex characters (i.e. “%2F”) to non-encoded? (default=true)

It would be good if this option was configurable for InputfieldURL/FieldtypeURL. As a quick fix you could copy the modules to /site/modules/ and edit the usages of $sanitizer->url() to set convertEncoded to false, and maybe open a GitHub issue/request to have that added as a configurable option.
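
To illustrate what such a flag changes, here is a small Python sketch. It is not the ProcessWire implementation: `sanitize_url` and its `convert_encoded` parameter are hypothetical stand-ins, and `urllib.parse.unquote` decodes every percent-escape, whereas the documentation quoted above says only "most" are converted.

```python
from urllib.parse import unquote


def sanitize_url(url: str, convert_encoded: bool = True) -> str:
    """Hypothetical sanitizer: optionally decode percent-escapes in a URL."""
    url = url.strip()
    # With the flag on, %2c becomes a literal comma and the original
    # spelling of the URL is lost; with it off, the URL is stored as typed.
    return unquote(url) if convert_encoded else url


original = "https://www.carlfischer.com/o4361-masters+of+our+day%2c+volume+i.html"
print(sanitize_url(original))                         # comma decoded
print(sanitize_url(original, convert_encoded=False))  # left untouched
```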


Wouldn't it be better to avoid commas in URLs? According to the URI RFC (RFC 3986), commas are reserved characters, although they are allowed in the path part of a URL, for example in filenames.

I haven't tried it, but you should be able to configure the InputfieldPageName module to convert commas to dashes (or other characters) and so avoid them in the first place.

A configurable option for convertEncoded in the InputfieldURL module also seems hacky to me.
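
For reference on the RFC point: as I read RFC 3986, only percent-escapes of unreserved characters (letters, digits, "-", ".", "_", "~") can be decoded without changing what a URL identifies, while a reserved character such as the comma is not equivalent to its encoded form. A minimal Python sketch of that rule follows; the function name `decode_unreserved` is made up for this example and it is not something ProcessWire provides.

```python
import re
import string

# Unreserved characters per RFC 3986: ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
_ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")


def decode_unreserved(url: str) -> str:
    """Decode only the percent-escapes that stand for unreserved characters."""

    def _replace(match: re.Match) -> str:
        char = chr(int(match.group(1), 16))
        # Reserved characters such as "," or "/" keep their escaped form.
        return char if char in UNRESERVED else match.group(0)

    return _ESCAPE.sub(_replace, url)


print(decode_unreserved("https://example.com/%7Euser/a%2cb%2Fc"))
```

Under this rule the %2c in the URL from the first post would be stored unchanged.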


@gebeer

My assumption is that most people use URL fields to store external URLs, so we don't have any way of preventing these external sites from using commas or encoded commas in their URLs.

If the URL field is supposed to store this external URL, then it should not modify the URL, because then the link to that external site won't work and the field is kind of useless at that point. The only reason I can think of for sanitizing that %2c would be for security of the database, but in that case there should be some way of reconstructing the original URL so that the stored link doesn't result in a 404.

Absolutely makes sense.
