27

I created a PHP script that checks HTTP_ACCEPT_LANGUAGE and loads the website in the appropriate language, based on the first two characters of the header:

      $http_lang = substr($_SERVER["HTTP_ACCEPT_LANGUAGE"],0,2);
      switch ($http_lang) {
        case 'en':
          $SESSION->conf['language'] = 'english';
          break;
        case 'es':
          $SESSION->conf['language'] = 'spanish';
          break;
        default:
          $SESSION->conf['language'] = $PREFS->conf['languages'][$SESSION->conf['language_id']];
      }

If I change the language to Spanish in Firefox, the website loads in Spanish fine. However, I have had several reports that people in Colombia see the website in English.

Details: "es-co" LCID = 9226 Spanish(Colombia)

Anyone have any ideas as to why this is happening? I thought this was the best way to check what language users support.

Alexander O'Mara
Beanie
  • The best way is to log IPs and their headers, and examine those logs later. – zerkms May 17 '11 at 23:20
  • Possible duplicate of [How to get the language value from $_SERVER\['HTTP_ACCEPT_LANGUAGE'\] using PHP?](http://stackoverflow.com/questions/2316476/how-to-get-the-language-value-from-serverhttp-accept-language-using-php) – AJ. May 17 '11 at 23:23
  • Could it be a case problem? Changing it to switch(strtolower($http_lang)) might help. Not sure though. – mjec May 20 '11 at 18:28
  • This is hideously flawed. The header provides a list of possibilities, which can have q values. Taking the first one regardless of quality is a terrible idea. Get a proper parser for it. – Quentin May 27 '11 at 21:32
  • Specs are here: [14.4 Accept-Language (RFC 2616 - Hypertext Transfer Protocol -- HTTP/1.1)](http://tools.ietf.org/html/rfc2616#section-14.4) – hakre Aug 27 '13 at 08:07
  • Can you use the http_negotiate_language function? http://php.net/manual/en/function.http-negotiate-language.php – ptkoz Jun 29 '15 at 11:07
  • I am using the [LanguageNegotiator](https://github.com/willdurand/Negotiation#language-negotiation). – Daniel W. Oct 20 '19 at 17:45
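
To make the points about case and q values concrete, here is a small sketch of what the first-two-characters approach returns for a few headers. The sample headers are made up for illustration (they are not taken from real logs), and `first_two_chars` is a hypothetical helper name:

```php
<?php
// Hypothetical helper mirroring the substr() call from the question.
function first_two_chars($header)
{
    // Cast so that older PHP versions, where substr() can return false, still yield a string.
    return (string) substr($header, 0, 2);
}

// Made-up sample headers, for illustration only.
$samples = array(
    'es-CO,es;q=0.8,en;q=0.5',  // starts with lowercase "es": hits the 'es' case
    'ES-co,es;q=0.8',           // uppercase "ES": misses both cases of the switch
    'en-US;q=0.3,es-CO;q=0.9',  // Spanish is preferred by q value, but "en" comes first
);

foreach ($samples as $header) {
    printf("%-26s => '%s'\n", $header, first_two_chars($header));
}
```

The second and third headers show two ways a Spanish-speaking visitor could end up outside the `'es'` branch; logging the real headers, as suggested above, would tell you which (if either) applies.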

7 Answers

16

A more contemporary method would be to use http_negotiate_language():

 $map = array("en" => "english", "es" => "spanish");
 $conf_language= $map[ http_negotiate_language(array_keys($map)) ];

If you have neither the http extension nor the intl extension installed, there is another workaround in the manual's comments (user note #86787, Nov 2008, by Anonymous):

<?php 
/* 
  determine which language out of an available set the user prefers most 

  $available_languages        array with language-tag-strings (must be lowercase) that are available 
  $http_accept_language    a HTTP_ACCEPT_LANGUAGE string (read from $_SERVER['HTTP_ACCEPT_LANGUAGE'] if left out) 
*/ 
function prefered_language ($available_languages,$http_accept_language="auto") { 
    // if $http_accept_language was left out, read it from the HTTP-Header 
    if ($http_accept_language == "auto") $http_accept_language = isset($_SERVER['HTTP_ACCEPT_LANGUAGE']) ? $_SERVER['HTTP_ACCEPT_LANGUAGE'] : ''; 

    // standard  for HTTP_ACCEPT_LANGUAGE is defined under 
    // http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.4 
    // pattern to find is therefore something like this: 
    //    1#( language-range [ ";" "q" "=" qvalue ] ) 
    // where: 
    //    language-range  = ( ( 1*8ALPHA *( "-" 1*8ALPHA ) ) | "*" ) 
    //    qvalue         = ( "0" [ "." 0*3DIGIT ] ) 
    //            | ( "1" [ "." 0*3("0") ] ) 
    // the qvalue part also accepts "1" and "0" without a decimal point, as the grammar above allows 
    preg_match_all("/([[:alpha:]]{1,8})(-([[:alpha:]-]{1,8}))?" . 
                   "(\s*;\s*q\s*=\s*(1(?:\.0{0,3})?|0(?:\.\d{0,3})?))?\s*(,|$)/i", 
                   $http_accept_language, $hits, PREG_SET_ORDER); 

    // default language (in case of no hits) is the first in the array 
    $bestlang = $available_languages[0]; 
    $bestqval = 0; 

    foreach ($hits as $arr) { 
        // read data from the array of this hit 
        $langprefix = strtolower ($arr[1]); 
        if (!empty($arr[3])) { 
            $langrange = strtolower ($arr[3]); 
            $language = $langprefix . "-" . $langrange; 
        } 
        else $language = $langprefix; 
        $qvalue = 1.0; 
        // compare against '' rather than using empty(), because empty("0") is true 
        if (isset($arr[5]) && $arr[5] !== '') $qvalue = floatval($arr[5]); 

        // find q-maximal language  
        if (in_array($language,$available_languages) && ($qvalue > $bestqval)) { 
            $bestlang = $language; 
            $bestqval = $qvalue; 
        } 
        // if no direct hit, try the prefix only but decrease q-value by 10% (as http_negotiate_language does) 
        else if (in_array($langprefix,$available_languages) && (($qvalue*0.9) > $bestqval)) { 
            $bestlang = $langprefix; 
            $bestqval = $qvalue*0.9; 
        } 
    } 
    return $bestlang; 
} 
?>
mario
  • To get the code example to work I had to replace `$langrange = strtolower ($arr[3]); ` with `$langrange = $arr[3]; `. Note that my lang strings are in the format es-CO rather than es-co as in the question. – Andrew Downes Feb 12 '15 at 23:31
  • Does anyone know what happened to `http_negotiate_language`? Can't find it on php.net... – brasofilo Nov 09 '17 at 16:38
  • @brasofilo I think it's been fully retired for the http v3.x.x pecl extension: https://mdref.m6w6.name/http/Header/negotiate → entirely new API, and isn't included in the official PHP docs. – mario Nov 09 '17 at 22:32
10

I used the regex from @GabrielAnderson and devised this function which behaves according to RFC 2616 (when no quality value is given to a language, it defaults to 1).

When several languages share the same quality value, the most specific ones are given priority over the less specific ones. This behaviour is not part of the RFC, which gives no recommendation for this case.

function Get_Client_Prefered_Language ($getSortedList = false, $acceptedLanguages = false)
{

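    if (empty($acceptedLanguages))
        // fall back to the request header; use '' when the client sent none
        $acceptedLanguages = isset($_SERVER["HTTP_ACCEPT_LANGUAGE"]) ? $_SERVER["HTTP_ACCEPT_LANGUAGE"] : '';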

        // regex inspired from @GabrielAnderson on http://stackoverflow.com/questions/6038236/http-accept-language
    preg_match_all('/([a-z]{1,8}(-[a-z]{1,8})*)\s*(;\s*q\s*=\s*(1|0\.[0-9]+))?/i', $acceptedLanguages, $lang_parse);
    $langs = $lang_parse[1];
    $ranks = $lang_parse[4];


        // (create an associative array 'language' => 'preference')
    $lang2pref = array();
    for($i=0; $i<count($langs); $i++)
        $lang2pref[$langs[$i]] = (float) (!empty($ranks[$i]) ? $ranks[$i] : 1);

        // (comparison function for uksort)
    $cmpLangs = function ($a, $b) use ($lang2pref) {
        if ($lang2pref[$a] > $lang2pref[$b])
            return -1;
        elseif ($lang2pref[$a] < $lang2pref[$b])
            return 1;
        elseif (strlen($a) > strlen($b))
            return -1;
        elseif (strlen($a) < strlen($b))
            return 1;
        else
            return 0;
    };

        // sort the languages by prefered language and by the most specific region
    uksort($lang2pref, $cmpLangs);

    if ($getSortedList)
        return $lang2pref;

        // return the first value's key
    reset($lang2pref);
    return key($lang2pref);
}

Example:

print_r(Get_Client_Prefered_Language(true, 'en,en-US,en-AU;q=0.8,fr;q=0.6,en-GB;q=0.4'));

Outputs:

Array
    (
        [en-US] => 1
        [en] => 1
        [en-AU] => 0.8
        [fr] => 0.6
        [en-GB] => 0.4
    )

As you can see, 'en-US' appears in first position even though 'en' came first in the given string.

So you could use this function and just replace your first line of code with:

$http_lang = substr(Get_Client_Prefered_Language(),0,2);
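
Note that the function returns the client's top choice whether or not your site supports it. A minimal sketch of one way to walk the sorted list and stop at the first language you actually offer; `pick_supported_language` and its parameters are hypothetical names, and `$sorted` is assumed to have the same shape as the array printed above:

```php
<?php
// Hypothetical helper: $sorted is a 'tag' => q array, already sorted by preference;
// $supported maps primary language subtags to site language names.
function pick_supported_language(array $sorted, array $supported, $default)
{
    foreach ($sorted as $tag => $q) {
        if ($q <= 0) {
            continue; // q=0 means the client does not accept this language
        }
        // Compare on the primary subtag only, case-insensitively ("es-CO" -> "es").
        $parts = explode('-', (string) $tag);
        $primary = strtolower($parts[0]);
        if (isset($supported[$primary])) {
            return $supported[$primary];
        }
    }
    return $default;
}

// Usage with a hand-written list in the same shape as the output above.
$sorted = array('fr' => 1.0, 'es-CO' => 0.9, 'en' => 0.5);
$supported = array('en' => 'english', 'es' => 'spanish');
echo pick_supported_language($sorted, $supported, 'english'), "\n";
```

Here 'fr' is skipped because the site does not offer it, so the next entry decides.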
2072
  • I might be mistaken, but shouldn't the while-statement in the $getRank function be an if-statement? – Ruben Vreeken Dec 03 '13 at 16:03
  • You are right, I don't remember why I used `while` instead of `if`. It might have been to emphasize that the recursion would continue while `$ranks[$j]` is defined... – 2072 Dec 14 '13 at 23:28
  • Really handy function. Thanks Peter. – Frank Jan 09 '14 at 09:48
  • It's a nice function, but does not work as intended. Let's say the input above is prefixed with 'de', i.e. 'de,en,en-US,...'. Clearly 'de' is now the most preferred language, but it is ranked only 0.8, and therefore is not sorted as the first array element. – Lauri Nurmi Feb 27 '14 at 12:27
  • @LauriNurmi: This is the expected behaviour: fully specified locales (en-US, en-AU) have a higher priority than less specific ones. If you want 'de' to be the most preferred language you should use a string like: 'de;q=1,en,en-US,en-AU;q=0.8,fr;q=0.6,en-GB;q=0.4' – 2072 Feb 27 '14 at 16:36
  • @LauriNurmi: I've checked with the RFC and indeed, when 'q' is not specified it should default to 1. (I've fixed my answer to respect the RFC). Now the problem is that the RFC has no recommendation when several languages share the same value, they are deemed to be equally acceptable. The order in which they appear in the string does not matter apparently... – 2072 Feb 27 '14 at 18:05
  • @2072: Thanks for the quick fix. Admittedly I haven't studied the RFC very closely. Even if it doesn't specify a significance for the order of languages, in practice browsers represent language preference selection as list where order matters, and the list is converted into an Accept-Language header in the same order. – Lauri Nurmi Feb 28 '14 at 09:01
  • This is the kind of answer I am looking for and should be marked as 'regulated' – Louis Loudog Trottier Feb 04 '17 at 09:24
8

Do you know if this is happening for all visitors to your site from Colombia? Users are usually free to alter the language settings of their browsers — or to have them altered for them by whoever is in charge of the computer. As zerkms recommends, try logging IP addresses and their headers.

If you have the intl extension installed, you can use Locale::lookup and Locale::acceptFromHttp to get a best-fit choice of language from the user's browser settings and the list of translations you have available.

Locale::acceptFromHttp($_SERVER['HTTP_ACCEPT_LANGUAGE']); # e.g. "en_US"
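
A minimal sketch of how the two calls could be combined, assuming the intl extension is loaded. The helper name `site_language_from_header`, the `$map` of tags to site language names and the fallback behaviour are my own assumptions, not something taken from the question:

```php
<?php
// Hypothetical helper: returns a site language name for an Accept-Language header.
// $map maps language tags you have translations for to site language names;
// $default must be one of its keys.
function site_language_from_header($header, array $map, $default)
{
    // Without a header or without intl, fall back to the default.
    if ($header === '' || !class_exists('Locale')) {
        return $map[$default];
    }

    // Best locale according to the header, e.g. "es_CO"; false if it cannot be parsed.
    $preferred = Locale::acceptFromHttp($header);
    if ($preferred === false) {
        return $map[$default];
    }

    // Best-fit match of that locale against the available tags.
    $match = Locale::lookup(array_keys($map), $preferred, true, $default);
    if ($match === null || !isset($map[$match])) {
        return $map[$default];
    }
    return $map[$match];
}

$map = array('en' => 'english', 'es' => 'spanish');
$header = isset($_SERVER['HTTP_ACCEPT_LANGUAGE']) ? $_SERVER['HTTP_ACCEPT_LANGUAGE'] : '';
echo site_language_from_header($header, $map, 'en'), "\n";
```

Keeping the available tags lowercase matters here, since the lookup is asked to canonicalize and its result is used as an array key.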
hakre
Nev Stokes
3

I use the full locale code to identify the language, because zh-TW and zh-CN, for example, are two different languages.

function httpAcceptLanguage($httpAcceptLanguage = null)
{
    if ($httpAcceptLanguage == null) {
        $httpAcceptLanguage = $_SERVER['HTTP_ACCEPT_LANGUAGE'];
    }

    $languages = explode(',', $httpAcceptLanguage);
    $result = array();
    foreach ($languages as $language) {
        // split "tag;q=weight", tolerating optional whitespace around the parts
        $lang = preg_split('/\s*;\s*q\s*=\s*/i', trim($language));
        // $lang == [language, weight], default weight = 1
        $result[$lang[0]] = isset($lang[1]) ? floatval($lang[1]) : 1;
    }

    arsort($result);
    return $result;
}

// zh-TW,en-US;q=0.7,en;q=0.3
echo $_SERVER['HTTP_ACCEPT_LANGUAGE'];
/*
    Array
    (
        [zh-TW] => 1
        [en-US] => 0.7
        [en] => 0.3
    )
 */
print_r(httpAcceptLanguage());
Steely Wing
3

In the end I went with this solution:

if (isset($_SERVER['HTTP_ACCEPT_LANGUAGE'])) {
  preg_match_all('/([a-z]{1,8}(-[a-z]{1,8})?)\s*(;\s*q\s*=\s*(1|0\.[0-9]+))?/i', $_SERVER['HTTP_ACCEPT_LANGUAGE'], $lang_parse);
  if (count($lang_parse[1])){
    $langs = array_combine($lang_parse[1], $lang_parse[4]);
    foreach ($langs as $lang => $val){
      if ($val === '') $langs[$lang] = 1;
    }
    arsort($langs, SORT_NUMERIC);
    // stop at the first (highest-ranked) language the site supports
    foreach ($langs as $lang => $val){
      if (strpos($lang,'en')===0){
        $language = 'english';
        break;
      } else if (strpos($lang,'es')===0){
        $language = 'spanish';
        break;
      }
    }
  }
}

I would like to thank AJ for the links. Also thanks to all that replied.

GSerg
Beanie
1

If you want to store the languages in an array, I do this:

preg_match_all('/([a-z]{1,8}(-[a-z]{1,8})?)\s*(;\s*q\s*=\s*(1|0\.[0-9]+))?/i',  'pt-br,pt;q=0.8,en-us;q=0.5,en,en-uk;q=0.3', $lang_parse);
$langs = $lang_parse[1];
$rank = $lang_parse[4];
for($i=0; $i<count($langs); $i++){
    if ($rank[$i] == NULL) $rank[$i] = $rank[$i+1];
}

This outputs one array of languages and another with their values. To get a single associative array instead:

preg_match_all('/([a-z]{1,8}(-[a-z]{1,8})?)\s*(;\s*q\s*=\s*(1|0\.[0-9]+))?/i', 'pt-br,pt;q=0.8,en-us;q=0.5,en,en-uk;q=0.3', $lang_parse);
$langs = $lang_parse[1];
$rank = $lang_parse[4];
$lang = array();
for($i=0; $i<count($langs); $i++){
    $lang[$langs[$i]] = ($rank[$i] == NULL) ? $rank[$i+1] : $rank[$i];
}

This outputs an array like this:

Array
(
    [pt-br] => 0.8
    [pt] => 0.8
    [en-us] => 0.5
    [en] => 0.3
    [en-uk] => 0.3
)
Gabriel Anderson
0

I put my trust in the skilled programmers who work for PHP and think ahead. Here is my version of a label for the Google translator drop down.

function gethttplanguage(){
    $langs = array(
            'en',// default
            'it',
            'da',// Danish (the ISO 639-1 code is "da", not "dn")
            'fr',
            'es'
    );
    $questions = array(
    "en" => "If you wish to see this site in another language click here",
    "it" => "Se vuole vedere questo sito in italiano clicca qui",
    "da" => "Hvis du ønsker at se denne hjemmeside i danske klik her",
    "fr" => "Si vous voulez visualiser ce site en français, cliquez ici",
    "es" => "Si quieres ver este sitio en español haga clic aquí"
    );
    $result = array();
    // $result is filled by reference; a call-time "&" is a fatal error since PHP 5.4
    http_negotiate_language($langs, $result);
    return $questions[key($result)];
}
brasofilo
user462990