I am using basic code like this to translate a website:

function getTranslation($db_value)
{
     global $db_array;

     // Index with the same lowercased key that is tested for existence
     $key = strtolower($db_value);
     if (array_key_exists($key, $db_array)) {
          return $db_array[$key];
     }
     return $db_value;
}

I also do a reverse lookup and pull up keys based on their values:

function findTranslation($displayed_string)
{
     global $db_array;

     // array_search() returns false when the value is not found
     $key = array_search(strtolower($displayed_string), $db_array);
     if ($key !== false) {
          return $key;
     }
     return $displayed_string;
}

For example, the array may look like this:

$db_array = array("Yes" => "si", "No" => "No");

Now, the word "Si" can also be typed as "Sí" (with an accent). The user may or may not type the accent mark, depending on their keyboard, etc. So is there a way to do this type of search while ignoring all accented variations of a letter (for example, the letter "i"), so that the input still matches the array key or value?

TheLettuceMaster
  • http://stackoverflow.com/a/3373364/2623144 – XaxD Jul 16 '14 at 21:34
  • Good luck trying to write your own translation software - it's not just word substitution! I feel like you'd be a million times better off to try and harness Google Translate's API here... – scrowler Jul 16 '14 at 21:38
  • @scrowler You're probably right, but we are only translating a small amount here. It's very much a "controlled environment". – TheLettuceMaster Jul 16 '14 at 22:26

1 Answer

Based on the link provided in the comments, plus some additional code, here is what I did:

$unwanted_accents = array(  'Š'=>'S', 'š'=>'s', 'Ž'=>'Z', 'ž'=>'z', 'À'=>'A', 'Á'=>'A', 'Â'=>'A', 'Ã'=>'A', 'Ä'=>'A', 'Å'=>'A', 'Æ'=>'A', 'Ç'=>'C', 'È'=>'E', 'É'=>'E',
                            'Ê'=>'E', 'Ë'=>'E', 'Ì'=>'I', 'Í'=>'I', 'Î'=>'I', 'Ï'=>'I', 'Ñ'=>'N', 'Ò'=>'O', 'Ó'=>'O', 'Ô'=>'O', 'Õ'=>'O', 'Ö'=>'O', 'Ø'=>'O', 'Ù'=>'U',
                            'Ú'=>'U', 'Û'=>'U', 'Ü'=>'U', 'Ý'=>'Y', 'Þ'=>'B', 'ß'=>'Ss', 'à'=>'a', 'á'=>'a', 'â'=>'a', 'ã'=>'a', 'ä'=>'a', 'å'=>'a', 'æ'=>'a', 'ç'=>'c',
                            'è'=>'e', 'é'=>'e', 'ê'=>'e', 'ë'=>'e', 'ì'=>'i', 'í'=>'i', 'î'=>'i', 'ï'=>'i', 'ð'=>'o', 'ñ'=>'n', 'ò'=>'o', 'ó'=>'o', 'ô'=>'o', 'õ'=>'o',
                            'ö'=>'o', 'ø'=>'o', 'ù'=>'u', 'ú'=>'u', 'û'=>'u', 'ü'=>'u', 'ý'=>'y', 'þ'=>'b', 'ÿ'=>'y' );

// Translate DB String TO Default Language
// (DB SELECTS)
function getTranslation($db_value)
{
     global $db_array;

     // Index with the same lowercased key that is tested for existence
     $key = strtolower($db_value);
     if (array_key_exists($key, $db_array)) {
          return $db_array[$key];
     }
     return $db_value;
}

// Translate DB String FROM Default Language
// (DB INSERTS / UPDATES)
function findTranslation($displayed_string)
{
     global $db_array;
     global $unwanted_accents;

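     // mb_strtolower() (mbstring extension) also lowercases accented capitals such as "Í";
     // plain strtolower() may leave them untouched. Assumes UTF-8 input.
     $displayed_string = mb_strtolower($displayed_string, 'UTF-8');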

     $key = array_search($displayed_string, $db_array);
     if ($key === false) { // array_search() returns false when nothing matches

          // remove potential accent marks from incoming var
          $displayed_string = strtr( $displayed_string, $unwanted_accents );

          // remove accents from db array
          foreach($db_array as $key => $value) {
              $sanitized_val = strtr( $value, $unwanted_accents ); 
              if ($sanitized_val == $displayed_string) {
                   return $key; // found after all accents stripped
              }              
          }
          return $displayed_string; // if all else fails
     }
     return $key; // found on first attempt    
}
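
As a possible refinement of the code above, the same idea can be expressed as one normalization step applied to both the stored translations and the typed input, with the reverse index built once instead of looping on every lookup. This is only a sketch under some assumptions: the helper names `normalizeForLookup` and `buildReverseIndex` are made up for illustration, the strings are UTF-8, the mbstring extension is available, and the short `$accentMap` stands in for the full `$unwanted_accents` table.

```php
<?php
// Hypothetical helper: lowercase, then strip accents, so that
// "Sí", "SÍ" and "si" all normalize to the same string.
// Assumes UTF-8 input and the mbstring extension.
function normalizeForLookup(string $text, array $accentMap): string
{
    return strtr(mb_strtolower($text, 'UTF-8'), $accentMap);
}

// Hypothetical helper: build a reverse index once,
// mapping each normalized translation to its original key.
function buildReverseIndex(array $translations, array $accentMap): array
{
    $index = [];
    foreach ($translations as $key => $value) {
        $index[normalizeForLookup($value, $accentMap)] = $key;
    }
    return $index;
}

// Shortened accent table for the example; the full $unwanted_accents array would be used in practice.
$accentMap = ['á' => 'a', 'é' => 'e', 'í' => 'i', 'ó' => 'o', 'ú' => 'u', 'ü' => 'u', 'ñ' => 'n'];

// Example data only: the stored value carries the accent, the typed input does not.
$translations = ['Yes' => 'sí', 'No' => 'no'];
$reverseIndex = buildReverseIndex($translations, $accentMap);

$typed = 'SI';
$normalized = normalizeForLookup($typed, $accentMap);

// Fall back to the typed string when no translation matches.
$key = $reverseIndex[$normalized] ?? $typed;
echo $key, PHP_EOL; // prints "Yes"
```

Because both sides pass through the same function, it does not matter whether the accent is missing from the user's input or from the stored value. If two translations normalize to the same string, the later one silently wins in this sketch, so that case would need separate handling.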
TheLettuceMaster