Non-(a-z) character support in PageAutocomplete or in the "Add New" Page input field
#1
Posted 26 May 2012 - 12:51 PM
I am trying to build tag functionality for hotel features, such as "pool", "sauna", etc., where new tags are created automatically when they are entered for the first time. The catch is that the tag titles need to be in Cyrillic characters.
I was hoping that PageAutocomplete would be ideal for a tag system, but it doesn't support searching with Cyrillic characters on keypress.
Unfortunately, the "Create new" feature in the Page input field doesn't support saving items with Cyrillic characters in their title, because it can't automatically replace them with their (a-z) equivalents for the "name" field.
1. Is there an easy way to use the PageName character replacement feature when using "Create new" in the Page input field? Don't you think it would be great if sanitizer->name supported such replacement internally?
2. And something related: someday I will ask whether PW will support non-(a-z) characters in URLs. I know that there are standards and that Cyrillic characters are not among the allowed characters. However, when searching for something in Cyrillic, many of the Google results contain Cyrillic characters in their URLs. Perhaps we should be competitive from an SEO point of view and be allowed to use non-(a-z) characters in the URL? The same goes for language-specific characters in German and other languages. What do you think?
Thanks
#3
Posted 27 May 2012 - 10:10 AM
Meanwhile I have found that the PageName input field replacement was NOT enabled by default in $sanitizer->pageName(). I modified the Pages->setupNew() method to enable it, and this allowed me to use the "Create new" feature with non-(a-z) characters.
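For readers unfamiliar with this kind of character replacement, here is a minimal, language-neutral sketch in Python of the general idea: map each Cyrillic letter to an (a-z) equivalent, then reduce everything else to hyphens. This is not ProcessWire's code. The `TRANSLIT` table and the `page_name_from_title` function are hypothetical names made up for illustration, the table is partial, and the choice to keep only a-z, 0-9 and hyphens is an assumption.

```python
import re

# Hypothetical, partial Cyrillic-to-Latin table (an assumption for illustration;
# a real site would configure its own replacements).
TRANSLIT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ж": "zh",
    "з": "z", "и": "i", "й": "y", "к": "k", "л": "l", "м": "m", "н": "n",
    "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u", "ф": "f",
    "х": "h", "ц": "ts", "ч": "ch", "ш": "sh", "щ": "sht", "ъ": "a",
    "ь": "y", "ю": "yu", "я": "ya",
}


def page_name_from_title(title: str) -> str:
    """Build an (a-z) page name from a title (hypothetical helper)."""
    lowered = title.lower()
    # Replace each mapped character; leave unmapped characters untouched for now.
    replaced = "".join(TRANSLIT.get(ch, ch) for ch in lowered)
    # Collapse every run of characters outside a-z and 0-9 into a single hyphen.
    dashed = re.sub(r"[^a-z0-9]+", "-", replaced)
    # Drop hyphens left over at either end.
    return dashed.strip("-")


print(page_name_from_title("Сауна"))          # sauna
print(page_name_from_title("Закрит басейн"))  # zakrit-baseyn
```

The title keeps its Cyrillic spelling. Only the "name" used in the URL is reduced to (a-z).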
#4
Posted 29 May 2012 - 10:25 AM
Meanwhile I have found that the PageName input field replacement was NOT enabled by default in $sanitizer->pageName().
Thanks, I will make the same change in the core, replacing the second 'true' param with 'Sanitizer::translate' in the setupNew() function. The translate option was added to the sanitizer pretty recently.
#5
Posted 29 May 2012 - 12:16 PM
I would like to bring up the second part of my post again, about using PW with characters other than the allowed (a-z-.) ones in URLs. It seems that Google prioritizes such sites over their competitors. Do you think this will be possible in a future PW release, and would it be worth the effort?
#6
Posted 31 May 2012 - 10:28 AM
Regarding Google and prioritization, is there any research/documentation that supports the theory that it prioritizes sites using UTF-8 in URLs? I guess that would surprise me, but I always have an open mind.
#7
Posted 31 May 2012 - 03:57 PM
EDIT: I mean I get this:
http://fi.wikipedia.org/wiki/%C3%84%C3%A4kk%C3%B6set
Edited by apeisa, 31 May 2012 - 03:58 PM.
#8
Posted 31 May 2012 - 04:18 PM
As far as I can tell, URIs are all represented in a subset of ASCII characters (see RFC 3986) but allow other characters (including Unicode characters) to be embedded by percent-encoding them into the URI. Browsers understand this: they decode URIs to display the correct characters in the address bar, and they let you type Unicode characters in the address, converting them on submission using URL encoding. You can do this yourself in PHP using urlencode() or rawurlencode().
Looks like copy and paste out of Chrome is pulling the encoded string out of the address bar.
Edited to add: Just found the relevant part of the article I linked...
The generic URI syntax mandates that new URI schemes that provide for the representation of character data in a URI must, in effect, represent characters from the unreserved set without translation, and should convert all other characters to bytes according to UTF-8, and then percent-encode those values. This requirement was introduced in January 2005 with the publication of RFC 3986. URI schemes introduced before this date are not affected.
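To make the "UTF-8 bytes, then percent-encode" rule concrete, here is a small sketch in Python (chosen only for illustration; PHP's rawurlencode() mentioned above plays the same role). The `percent_encode` helper is a hypothetical name written out by hand to show the two steps. `quote` and `unquote` are the standard library's own functions from `urllib.parse`.

```python
from urllib.parse import quote, unquote

# The unreserved set from RFC 3986: letters, digits, and "-", ".", "_", "~".
UNRESERVED = set(
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
)


def percent_encode(text: str) -> str:
    """Hypothetical helper: UTF-8 encode, then percent-encode non-unreserved bytes."""
    return "".join(
        chr(b) if b in UNRESERVED else f"%{b:02X}"
        for b in text.encode("utf-8")
    )


title = "Ääkköset"
encoded = percent_encode(title)

print(encoded)           # %C3%84%C3%A4kk%C3%B6set
print(unquote(encoded))  # Ääkköset

# The hand-written version agrees with the standard library for this input.
assert encoded == quote(title, safe="")
```

The encoded form matches the Wikipedia address pasted earlier in the thread. The browser shows the decoded form in the address bar, but the encoded form is what is sent over the wire.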
Edited by netcarver, 31 May 2012 - 04:24 PM.