
CSV page import module issue


Soma

Ryan, if I import pages with special characters, which usually get converted correctly in the page name when creating a new page, they get lost, because the page name sanitizer doesn't take the replacements from the page name module config into account.

I think this has been mentioned elsewhere, but I can't search for "csv" on the forum (search terms must be longer than 3 characters).

How do you see this working without having the config in two places? Move the config that's currently in the page name module into a site-wide config? Or could the sanitizer read the replacements from the page name module?


The replacements defined with the InputfieldPageName module are only used in that live JavaScript conversion, whereas $sanitizer->pageName uses PHP's iconv() to perform a UTF-8 to ASCII conversion (when the $beautify param is set to true). When the $beautify param is omitted, it just removes anything invalid, converting it to a dash.
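To make the two modes described above concrete, here is a minimal sketch. It is not ProcessWire's actual Sanitizer source: the function name sketchPageName and the exact set of allowed characters are assumptions for illustration only. What iconv produces for a given character can differ between platforms, so no output is shown for the beautify call.

```php
<?php
// Illustrative stand-in for the behavior described in the post;
// sketchPageName is a hypothetical name, not a ProcessWire function.
function sketchPageName(string $value, bool $beautify = false): string
{
    if ($beautify) {
        // Ask iconv for an ASCII approximation of the UTF-8 input.
        // The transliteration result depends on the iconv implementation.
        $ascii = @iconv('UTF-8', 'ASCII//TRANSLIT//IGNORE', $value);
        if ($ascii !== false) {
            $value = $ascii;
        }
    }
    $value = strtolower($value);
    // Replace each run of characters outside the assumed allowed set with one dash.
    $value = preg_replace('/[^-_.a-z0-9]+/', '-', $value);
    return trim($value, '-');
}

// Without beautify, the umlaut is simply dropped.
echo sketchPageName('Über uns'), "\n"; // prints "ber-uns"

// With beautify, the result depends on what iconv returns for "Ü".
echo sketchPageName('Über uns', true), "\n";
```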

They don't use the same method because the pageName() in Sanitizer needs to be predictable so that we know it's always going to return the same thing on any installation and at any time. Plus, it needs to be really fast, because it's potentially called hundreds or thousands of times per request. Whereas the one in InputfieldPageName is done as a live translation that you can observe as it does it (and thus can be configurable), and it's okay for it to be slow with lots of RegExps.

The consistent behavior with pageName() is actually kind of important to the CSV import module, as the page names are used as a primary key and become a means of determining if something should be updated or created new. Though if you'd never imported pages before, and you weren't ever going to change the InputfieldPageName translation string further, then it wouldn't matter.

It sounds like in this case, we need another PHP-based method of converting page names that also does translation, like the JS one. Then you could modify the CSV import module to call upon that rather than $sanitizer->pageName. I'm thinking the PHP based method should probably be defined directly in the InputfieldPageName module, since that module owns the custom page name translation settings. Once in place, we could still make it accessible through Sanitizer perhaps through a new pageNameTranslated function, or something like that.
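A rough sketch of what such a PHP-based translating method might look like follows. Everything in it is an assumption: the function name pageNameTranslated is borrowed from the suggestion above, and the $replacements array is a hypothetical stand-in for the translation settings stored with InputfieldPageName, not the module's real data format.

```php
<?php
// Hypothetical replacement map, standing in for the InputfieldPageName settings.
// Uppercase variants are listed explicitly so no multibyte lowercasing is needed.
$replacements = [
    'ä' => 'ae', 'ö' => 'oe', 'ü' => 'ue',
    'Ä' => 'ae', 'Ö' => 'oe', 'Ü' => 'ue',
    'ß' => 'ss',
];

// Hypothetical function: translate first, then reduce to valid page name characters.
function pageNameTranslated(string $value, array $replacements): string
{
    // Apply the configured character replacements.
    $value = strtr($value, $replacements);
    $value = strtolower($value);
    // Replace each run of remaining invalid characters with a single dash.
    $value = preg_replace('/[^-_.a-z0-9]+/', '-', $value);
    return trim($value, '-');
}

echo pageNameTranslated('Über uns: Größe', $replacements), "\n"; // prints "ueber-uns-groesse"
```

Because the map is plain data, the result is only as stable as the configured replacements, which is the predictability trade-off mentioned earlier.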


That would be great, Ryan.

Since almost every project we do is multilingual or German, this gets me every time.

Also, when generating pages through the API, the pageName sanitizer isn't usable because of that.

Having an alternative function would be welcome.


Soma, I've added this capability to Sanitizer. To use it, specify Sanitizer::translate as the second param to $sanitizer->pageName(), i.e.

$name = $sanitizer->pageName($name, Sanitizer::translate); 

To make it work with the ImportPagesCSV module, edit the module file. Search for this:

if($name == 'title') $page->name = $this->sanitizer->pageName($value); 

and replace it with this:

if($name == 'title') $page->name = $this->sanitizer->pageName($value, Sanitizer::translate); 

I'm going to make the same change in the module itself, but I need to think a little more about how to avoid messing up anyone who's already using it. If someone already using the module upgrades, this could cause it to create new pages when it should be updating existing ones that were imported with a past version. I'll probably make it a config option or something.
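The upgrade risk can be illustrated with a small standalone sketch. It does not use the module's code: legacyName, translatedName, importName and the $useTranslate flag are hypothetical names, and the replacement map is a made-up example. The flag defaults to the old behavior so that names generated by earlier imports still match.

```php
<?php
// Hypothetical stand-in for the old behavior: invalid characters become dashes.
function legacyName(string $title): string
{
    $name = preg_replace('/[^-_.a-z0-9]+/', '-', strtolower($title));
    return trim($name, '-');
}

// Hypothetical stand-in for the translating behavior, with an assumed map.
function translatedName(string $title): string
{
    $map = ['ä' => 'ae', 'ö' => 'oe', 'ü' => 'ue', 'ß' => 'ss'];
    return legacyName(strtr($title, $map));
}

// Assumed config option: off by default so existing imports keep matching.
function importName(string $title, bool $useTranslate = false): string
{
    return $useTranslate ? translatedName($title) : legacyName($title);
}

// A page imported by an earlier version is keyed by its legacy name.
$existing = [legacyName('Größe') => 'page imported earlier'];

// Old behavior finds the page, so the import would update it.
var_dump(isset($existing[importName('Größe')]));       // bool(true)

// New behavior yields a different name, so the import would create a new page.
var_dump(isset($existing[importName('Größe', true)])); // bool(false)
```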

