
pspell

Is there a way to get an array of the dictionaries that pspell supports, preferably with their full human-readable names, using the PHP API?

For the time being I'm doing it this way:

$dicts = explode(PHP_EOL,rtrim(`aspell dicts`));

But that still doesn't give me the human-friendly version (e.g., en_CA might be "English - Canadian").


Solution

Here's what I came up with:

// List the installed dictionaries by shelling out to aspell (one name per line).
$dicts = explode(PHP_EOL,rtrim(`aspell dicts`));
$ltr = array(); // dictionary name => human-friendly label
foreach($dicts as $dict) {
    if(preg_match('`^([a-z]+)$`',$dict,$m)) {
        // Language only, e.g. "en"
        $lang = _iso639_1_to_lang_name($dict) ?: $dict;
    } elseif(preg_match('`^([a-z]+)_([A-Z]+)$`',$dict,$m)) {
        // Language and country, e.g. "en_CA"
        $lang = (_iso639_1_to_lang_name($m[1]) ?: $m[1]).' - '.(_country_code_to_adjectival($m[2])?:$m[2]);
    } elseif(preg_match('`^([a-z]+)_([A-Z]+)-(.*)$`',$dict,$m)) {
        // Language, country and variant, e.g. "en_CA-w_accents"
        $lang = (_iso639_1_to_lang_name($m[1]) ?: $m[1]).' - '.(_country_code_to_adjectival($m[2])?:$m[2]).' - '.ucwords(str_replace('_',' ',$m[3]));
    } elseif(preg_match('`^([a-z]+)-(.*)$`',$dict,$m)) {
        // Language and variant, e.g. "en-w_accents"
        $lang = (_iso639_1_to_lang_name($m[1]) ?: $m[1]).' - '.ucwords(str_replace('_',' ',$m[2]));
    } else {
        // Unrecognized pattern: fall back to the raw dictionary name
        $lang = $dict;
    }
    $ltr[$dict] = $lang;
}

Those functions just look up the full names from ISO codes based on data I scraped from Wikipedia.
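
For readers who want to run the snippet above, here is a minimal sketch of what the two lookup helpers might look like. The post does not show the real implementations or the data scraped from Wikipedia, so the signatures, the null-on-miss behaviour, and the handful of table entries below are assumptions for illustration only.

```php
<?php
// Hypothetical stand-ins for the helpers used above. The real versions
// are not shown in the post; these tables contain only a few sample rows.

// Map an ISO 639-1 language code to an English language name.
function _iso639_1_to_lang_name(string $code): ?string
{
    static $names = [
        'en' => 'English',
        'fr' => 'French',
        'de' => 'German',
    ];
    // Return null on a miss so callers can fall back with ?:
    return $names[$code] ?? null;
}

// Map an ISO 3166-1 alpha-2 country code to an adjectival form.
function _country_code_to_adjectival(string $code): ?string
{
    static $adjectives = [
        'CA' => 'Canadian',
        'GB' => 'British',
        'US' => 'American',
    ];
    return $adjectives[$code] ?? null;
}

// Example: build the label for "en_CA" the same way the loop does.
echo (_iso639_1_to_lang_name('en') ?: 'en') . ' - '
    . (_country_code_to_adjectival('CA') ?: 'CA'), PHP_EOL;
```

Returning null for unknown codes is what lets the `?:` fallbacks in the loop substitute the raw code when no name is available.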

mpen

2 Answers


Do you mean find out which dictionaries you have on your system? pspell is just a very thin wrapper over the GNU aspell library. It supports whatever the aspell library supports, but unfortunately it does not provide a way to list all the dictionaries. It would be nice if that functionality was added.

Sherif
  • Yes; what I have on my system. Can I get the human-friendly names from the command line, then? Or do those even exist? – mpen Dec 10 '12 at 21:38
  • I'm not sure that they do. The dictionaries are named that way by convention: the two-letter lowercase language code, followed by the two-letter uppercase dialect, with optional subcategories such as "w_accents" (with accents), "wo_accents" (without accents), etc. – Sherif Dec 10 '12 at 21:44
  • The aspell manual says: "By convention the language name should be the two letter ISO 639 language code if it exists, if not use the three letter code." Here's the link to the manual, which might help: http://aspell.net/man-html/The-Language-Data-File.html – Sherif Dec 10 '12 at 21:48
  • Thanks. I wrote some code to split the dictionary names apart then created some lookups to convert it into the format I want. – mpen Dec 10 '12 at 23:26

You have to use locale_parse; see the details in the PHP documentation:

http://php.net/manual/en/locale.parselocale.php

SaidbakR
  • While that is not a bad idea, it actually doesn't do what the OP wants. They want the full-English name of the language (*if I understood them correctly*). `locale_parse` will not provide this information. It would simply return the same ISO information, just as an array. The OP is asking to get an array of dictionaries not an array of parsed locales. – Sherif Dec 10 '12 at 21:53
  • Well, I understood that $dicts = explode(PHP_EOL,rtrim(`aspell dicts`)) returns a list (array), but not in a format suitable for his needs, so using locale_parse may help him reformat this array. Do you agree with me? @GoogleGuy – SaidbakR Dec 10 '12 at 21:57
  • No, because it doesn't do what the OP wants. They aren't trying to format the locale. They want the full-English name of the language. For example they want `en_CA` translated to `English Canadian`. That is not what locale_parse does. – Sherif Dec 10 '12 at 22:17
  • OK, now I don't know whether what I suggested is useful for the OP or not! – SaidbakR Dec 10 '12 at 22:20
  • @sємsєм: I don't have PECL installed and looking at the docs, it doesn't look like that output is helpful to me. Seems to just spit out more codes, not human-friendly words. – mpen Dec 10 '12 at 23:26