Autodetect Language


Pierre-Luc

On GitHub: https://github.com/plauclair/AutodetectLanguage
PW's Modules repo: http://modules.processwire.com/modules/autodetect-language/
 
This ProcessWire 2.x module tries to find the best match between HTTP_ACCEPT_LANGUAGE and the currently installed languages.
 
If a match is found, the user is redirected from the requested page to the same page in their preferred language. This matching happens only on the first page load, and works with default caching on. If no match is found, the website falls back to the "default" language.
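The matching described above can be sketched roughly as follows. This is a minimal illustration, not the module's actual code: the function names parseAcceptLanguage() and pickLanguage() and the plain list of installed language codes are assumptions made for the example.

```php
<?php
declare(strict_types=1);

/**
 * Parse an Accept-Language header into lower-cased language tags,
 * ordered by quality value (highest first, header order on ties).
 * Hypothetical helper, not part of the module.
 */
function parseAcceptLanguage(string $header): array
{
    $weighted = [];
    foreach (explode(',', $header) as $position => $part) {
        $pieces = explode(';', trim($part));
        $tag = strtolower(trim($pieces[0]));
        if ($tag === '' || $tag === '*') {
            continue; // nothing usable in this entry
        }
        $q = 1.0; // quality defaults to 1 when no q= parameter is given
        foreach (array_slice($pieces, 1) as $param) {
            $param = trim($param);
            if (strncasecmp($param, 'q=', 2) === 0) {
                $q = (float) substr($param, 2);
            }
        }
        if ($q > 0) {
            $weighted[] = ['tag' => $tag, 'q' => $q, 'pos' => $position];
        }
    }
    usort($weighted, function (array $a, array $b): int {
        return ($b['q'] <=> $a['q']) ?: ($a['pos'] <=> $b['pos']);
    });
    return array_column($weighted, 'tag');
}

/**
 * Return the first installed language that matches the header,
 * trying the full tag first and then its primary subtag ("de-ch" -> "de").
 * Falls back to $default when nothing matches.
 */
function pickLanguage(string $header, array $installed, string $default): string
{
    foreach (parseAcceptLanguage($header) as $tag) {
        if (in_array($tag, $installed, true)) {
            return $tag;
        }
        $primary = explode('-', $tag)[0];
        if (in_array($primary, $installed, true)) {
            return $primary;
        }
    }
    return $default;
}

echo pickLanguage('de-CH,de;q=0.9,en;q=0.8', ['en', 'de'], 'en'), "\n"; // de
echo pickLanguage('', ['en', 'de'], 'en'), "\n";                        // en
```

In a real module the redirect would only be issued when the picked language differs from the one of the requested page.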
 
Installation info and more details are on GitHub. Please submit any bugs to the bug tracker there, as that makes it easier to follow the different issues.


  • 4 months later...

I just pushed version 1.0.3. This version introduces a new option: "Do not detect language if GET parameter is set". You can now name a GET parameter that, when present in a request, prevents language redirection. For example, if this parameter is set to "nodetect" and the user requests "http://example.com/de/?nodetect", the plugin will neither redirect the user to their preferred language nor fall back to the default.
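As a rough sketch of how such a check can work (the function name shouldDetectLanguage() and its arguments are assumptions for illustration, not the module's own API):

```php
<?php
declare(strict_types=1);

/**
 * Decide whether language detection should run for a request.
 *
 * $query mirrors $_GET; $skipParam is the configured parameter name.
 * Both names are hypothetical.
 */
function shouldDetectLanguage(array $query, string $skipParam): bool
{
    if ($skipParam === '') {
        return true; // option not configured, always detect
    }
    // "?nodetect" carries an empty value, so test for the key, not the value
    return !array_key_exists($skipParam, $query);
}

// Build a query array from the example URL in the post
parse_str((string) parse_url('http://example.com/de/?nodetect', PHP_URL_QUERY), $query);

var_dump(shouldDetectLanguage($query, 'nodetect')); // bool(false)
var_dump(shouldDetectLanguage([], 'nodetect'));     // bool(true)
```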


  • 2 months later...

Hi,

have you tested this great module :) with ProCache (or only with the default cache)?

In production I see strange behaviour (it does not work, there is no redirect), but if I clear or disable ProCache it works well.

My default language isn't EN but IT, so it should always redirect to /en/ for a browser with HTTP_ACCEPT_LANGUAGE != it.

Thanks


Nope, unfortunately I don't typically use ProCache on my projects. My understanding is that ProCache completely bypasses the typical rendering procedure, and this module hooks into Page::render, so they might not be compatible. I would suggest contacting support to see if there's a way to hook into it; I'd be glad to implement this.


:(

Aside from ProCache, another little question.

My default system language is it (customer request), but for all requests from non-it browsers I want to fall back to en.

With defaultSiteLanguageCode set to en, all requests from it browsers are redirected to en (because it does not match -> "default").

Would something like this make sense to you?

- defaultSiteLanguageCode -> it

- fallbackSiteLanguageCode -> en

Or something similar..

Thanks


And also: add a check for languages that are not active (e.g. fr), so that when a language is visible but not active, translators can translate pages in the backend but the module does not redirect to the /fr/ page.
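The two requests above (a separate fallback code and skipping inactive languages) could look roughly like this. It is only a sketch of the idea: resolveRedirectLanguage() and the code => active map are hypothetical, and fallbackSiteLanguageCode is the option name proposed in the post, not an existing setting.

```php
<?php
declare(strict_types=1);

/**
 * Pick the language to redirect to.
 *
 * @param string[]            $browserLanguages language codes in order of preference
 * @param array<string, bool> $siteLanguages    code => whether it is active on the front end
 * @param string              $fallbackSiteLanguageCode used when no active language matches
 */
function resolveRedirectLanguage(
    array $browserLanguages,
    array $siteLanguages,
    string $fallbackSiteLanguageCode
): string {
    foreach ($browserLanguages as $code) {
        // Inactive languages exist only for translators, so never redirect to them
        if (($siteLanguages[$code] ?? false) === true) {
            return $code;
        }
    }
    return $fallbackSiteLanguageCode;
}

$site = ['it' => true, 'en' => true, 'fr' => false];

echo resolveRedirectLanguage(['it'], $site, 'en'), "\n";       // it
echo resolveRedirectLanguage(['fr', 'de'], $site, 'en'), "\n"; // en
```

With this split, the system default can stay it while every browser that matches no active language lands on en.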


  • 1 month later...

@Pierre-Luc, thanks for your module, but I always thought that redirecting to a language automatically isn't a good idea, and Google seems to agree:

https://support.google.com/webmasters/answer/182192?hl=en

Avoid automatic redirection based on the user’s perceived language. These redirections could prevent users (and search engines) from viewing all the versions of your site.

What do you guys think?


The problem for Google is that the crawler will most likely be redirected every time it visits your site, depending on how "first visit" detection is implemented. For real users it's only bad if you autodetect the language without letting them change it afterwards; everything else is no problem.
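One common way to implement "first visit" detection is a session flag, sketched below. The function detectOnce() and the 'languageDetected' key are assumptions for illustration; the module may do this differently. A client that does not keep the session cookie, as crawlers typically do not, looks like a first visit on every request, which is the situation described above.

```php
<?php
declare(strict_types=1);

/**
 * Run language detection at most once per session.
 *
 * $session stands in for $_SESSION (passed by reference so the flag sticks);
 * $detect is any callable returning a language code.
 * Returns null when detection already ran, so a later manual
 * language switch by the user is never overridden.
 */
function detectOnce(array &$session, callable $detect): ?string
{
    if (!empty($session['languageDetected'])) {
        return null;
    }
    $session['languageDetected'] = true;
    return $detect();
}

$session = [];
$detect = function (): string {
    return 'de'; // placeholder for real Accept-Language matching
};

var_dump(detectOnce($session, $detect)); // string(2) "de"
var_dump(detectOnce($session, $detect)); // NULL
```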


Hey Sérgio, I've read that too, but after doing some research it's really not as clear cut as StackOverflow might want it to be. :P

As far as I know crawlers don't use Accept-Language typically, and the plugin does absolutely no redirection when it is not set so it should not change their behaviour. This is what Google's documentation says, and others too, and it's also why I implemented the plugin the way it is implemented.

From the "Location-aware crawling" page from Google, you can read this:

If your website has pages that return different content based on the perceived country or preferred language of the visitor (i.e., you have locale-adaptive pages), Google might not crawl, index, or rank all of your locale-adaptive content. This is because the default IP addresses of the Googlebot crawler appear to be based in the USA. In addition, the crawler sends HTTP requests without setting Accept-Language in the request header.

This means that if your website serves content purely based on Accept-Language, there might be trouble, because Google typically isn't sending one. I've seen this happen in the past. The plugin does none of this; it's in your hands to implement a good solution.

What you want to do:

  • Do use <link> with hreflang to help the crawlers find other pages.
  • Don't serve multiple languages from the same URL. Always use lang.example.com/.. or example.com/lang/.. Otherwise it's confusing to users, and they can't send pages to their friends in the "right" language.
  • Don't use GET parameters to set anything language related.
  • Don't prevent the user (or crawlers) from switching languages, so they don't become trapped in a language that isn't their own. In other words, always have a language switcher in a prominent place and don't detect more than once.
  • Read all of the "International" section very carefully and find a solution that works for you. Localization is not an easy topic.
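For the first point in the list above, the hreflang annotations can be generated from a map of language codes to URLs. This is a minimal sketch with a hypothetical helper name, hreflangLinks(); in ProcessWire the URLs would come from the page's language-specific URLs, which is left out here to keep the example self-contained.

```php
<?php
declare(strict_types=1);

/**
 * Build <link rel="alternate" hreflang="..."> tags for the <head>.
 *
 * @param array<string, string> $urlsByLanguage language code => absolute URL
 * @param string|null           $xDefaultUrl    optional URL for hreflang="x-default"
 */
function hreflangLinks(array $urlsByLanguage, ?string $xDefaultUrl = null): string
{
    $tags = [];
    foreach ($urlsByLanguage as $language => $url) {
        $tags[] = sprintf(
            '<link rel="alternate" hreflang="%s" href="%s">',
            htmlspecialchars((string) $language, ENT_QUOTES),
            htmlspecialchars($url, ENT_QUOTES)
        );
    }
    if ($xDefaultUrl !== null) {
        $tags[] = sprintf(
            '<link rel="alternate" hreflang="x-default" href="%s">',
            htmlspecialchars($xDefaultUrl, ENT_QUOTES)
        );
    }
    return implode("\n", $tags);
}

echo hreflangLinks(
    ['en' => 'https://example.com/en/', 'de' => 'https://example.com/de/'],
    'https://example.com/'
), "\n";
```

Every language version of a page should list all versions, including itself, so the same output can be printed on each of them.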

Note, though, that Google tells you which alternative methods are supported (though there's a huge red warning box at the top of the page):

Currently, Googlebot recognizes a number of signals and hints to determine if your website serves locale-specific content:

  • 1) Serving different content on the same URL—based on the user’s perceived country (geolocation)
  • 2) Serving different content on the same URL—based on the Accept-Language field set by the user’s browser in the HTTP request header
  • 3) Completely blocking access to requests from specific countries

In each case:

1) AutodetectLanguage isn't a location-aware (GeoIP or otherwise) plugin.

2) Discussed earlier, wouldn't recommend for various reasons but might still work. Google says "usually don't".

3) That's the firewall's job! :P

However, if you guys follow best practices and still have trouble being indexed, I'm open to looking at more "advanced" solutions. But more complexity means more points of failure, and that's something I usually want to avoid. I just want something that's accessible to beginners and works in the most common cases. :)


  • 2 months later...

Hi Pierre-Luc,

first, thank you for this nice module!

I added a new variable 'noBackend' which makes it possible to test whether the requested page belongs to the backend. As I configure the language for logged-in users explicitly, I wanted a way to prevent admin pages from being redirected without using a GET parameter.
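The patch itself isn't shown in the thread. A check of this kind could be sketched as below, assuming the backend lives under a single admin path; isBackendRequest() and the '/processwire/' default are illustrative assumptions, and inside ProcessWire one would more likely test the requested page's template instead.

```php
<?php
declare(strict_types=1);

/**
 * Tell whether a request path points into the backend.
 * Hypothetical helper; the admin path is configurable per site.
 */
function isBackendRequest(string $requestPath, string $adminPath = '/processwire/'): bool
{
    // Normalise both paths to end in exactly one slash so that
    // "/processwire" matches but "/processwire-news/" does not
    $admin = rtrim($adminPath, '/') . '/';
    $path = rtrim($requestPath, '/') . '/';
    return strncmp($path, $admin, strlen($admin)) === 0;
}

var_dump(isBackendRequest('/processwire/page/edit/')); // bool(true)
var_dump(isBackendRequest('/de/about/'));              // bool(false)
```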

Shall I send you a patch?

Cheers Oliver


  • 2 months later...

As far as I know crawlers don't use Accept-Language typically,

I had a problem with WordPress recently. After the Google update from last week, all translated pages got dropped from the index and their content was replaced with the default EN language content. A mega disaster. The fix was to switch off automatic language redirection (WPML).

I am mentioning this here because, if this module is built on the above-mentioned assumption that crawlers don't use Accept-Language, it would be interesting to know whether this is still the case and what the consequences are if not.

I am not an SEO specialist, but I found this brand new article from 2016:

https://support.google.com/webmasters/answer/6144055?hl=en

If your site alters its content based on any Accept-Language field set by browsers’ HTTP headers, Googlebot uses a variety of signals to try to crawl your content using different Accept-Language HTTP headers. This means Google is more likely to discover, index, and rank your content in the different languages your site supports.

So, what do you think?


Hard to say without seeing the <head> on the specific pages you're talking about.

Google still says this though: IMPORTANT: We continue to support and recommend using separate locale URL configurations and annotating them with rel=alternate hreflang annotations.

 
Detection based on Accept-Language is one thing (this targets the users); you should still use the hreflang tags and <html lang="…"> (this targets crawlers). I believe this will give better results, as it's not prone to interpretation. I haven't seen anything weird on the sites I use the plugin on yet, but I will be monitoring.

  • 1 year later...
  • 1 year later...

I'm running PW 3.0.98 and simply can't find a way to make the front end default to fr-FR. I hoped this module would do the magic, but I only ever get en-EN when entering the site in a private session. The guest user's language is fr. Is there something else I need to be doing?


I've used this approach very often in the past, @benbyf:

<?php

// create /site/modules/LanguageDefault
// create file LanguageDefault.module
// refresh modules in PW admin

// https://processwire-recipes.com/recipes/change-homepages-default-language/

class LanguageDefault extends WireData implements Module {

    /**
     * getModuleInfo is a method required by all modules to tell ProcessWire about them
     *
     * @return array
     *
     */
    public static function getModuleInfo() {

        return array(
            'title' => 'LanguageDefault', 
            'version' => 1, 
            'summary' => 'A work around to changing the default language.',
            'href' => 'https://processwire.com/talk/topic/9322-change-default-language-for-homepage/?p=89717',
            'singular' => true, 
            'autoload' => true, 
        );
    }

    /**
     * Initialize the module
     *
     * ProcessWire calls this when the module is loaded. For 'autoload' modules, this will be called
     * when ProcessWire's API is ready. As a result, this is a good place to attach hooks. 
     *
     */
    public function init() {
        $this->session->addHookBefore('redirect', $this, 'setDefaultLanguage'); 
    }

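    /**
     * Rewrite the homepage redirect so it targets 'de' instead of the default language
     *
     * 'de' is the name of the target language; change it to match your installation.
     *
     */
    public function setDefaultLanguage($event) {
        // Only act on the homepage (id 1) when it is about to redirect to its default-language URL
        if ($this->page->id == 1 && $event->arguments(0) == $this->page->localUrl('default')) {
            $event->arguments(0, $this->page->localUrl('de'));
        }
    }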

}

Found at: https://processwire-recipes.com/recipes/change-homepages-default-language/


Yes... no... maybe... most of my PW sites are in German, so if I get this wrong, it's my fault. Otherwise I missed a thought.

But yes, it would be nice to be able to set a "real default" language in PW, and with it a PW-wise session/redirect.

Nonetheless, this snippet saves me most of the time when I have misconfigured something.

 

Hope this helps in your case.

