I was wondering how to sort alphabetically a list of Spanish words [with accents].
Excerpt from the word list:
Chocó
Cundinamarca
Córdoba
Cygwin uses GNU utilities, which are usually well-behaved when it comes to locales - a notable and regrettable exception is awk (gawk).
The following is based on Cygwin 1.7.31-3, current as of this writing.
es_ES, i.e., Spain's locale. The only way to change that is to explicitly override the default - see below.
LANG (see below; for an overview of all methods, see https://superuser.com/a/271423/139307)
To see what locale is in effect in Cygwin, run locale and inspect the value of the LANG variable.
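As a rough sketch of that check (the "(unset)" placeholder and the variable name current_lang are illustrative only, and the output depends entirely on your system):

```shell
# Effective LANG value; "(unset)" is only a placeholder used by this sketch.
current_lang="${LANG:-(unset)}"
echo "LANG is: $current_lang"

if command -v locale >/dev/null 2>&1; then
    # LANG plus every LC_* category; warnings printed here may indicate
    # that the requested locale is not available on this system.
    locale

    # Spanish locales known to the system, if any.
    locale -a | grep -i '^es_' || echo "no es_* locales listed"
fi
```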
If that doesn't show es_*.utf8 (where * represents your region in the Spanish-speaking world, e.g., CO for Colombia, ES for Spain, ...), set the locale as follows:
Open Edit environment variables for your account, which opens the Environment Variables dialog.
Define the variable LANG with the desired locale, e.g., es_CO.utf8 - UTF-8 character encoding is usually the best choice.
Any Cygwin bash shell you open from then on should reflect the new locale - verify by running locale and ensuring that the LC_* values match the LANG value and that no warnings are reported.
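If you only want to try a locale before making it persistent, a session-level override is one option. This is a minimal sketch, not the persistent Windows setting described above; it assumes es_CO.utf8 is available, and effective_collate is just an illustrative variable name:

```shell
# Assumption: es_CO.utf8 (the example locale from the text) is available.
export LANG=es_CO.utf8

# LC_ALL, when set, overrides LANG and every LC_* variable, so clear it.
unset LC_ALL

# Sorting is governed by LC_COLLATE, which falls back to LANG when unset.
effective_collate="${LC_COLLATE:-$LANG}"
echo "collation locale: $effective_collate"
```

The override lasts only for the current shell and the processes started from it.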
At that point, the following:
sort <<<$'Chocó\nCundinamarca\nCórdoba'
should produce (i.e., ó will sort directly after o, as desired):
Chocó
Córdoba
Cundinamarca
Note: locale en_US.utf8 would produce the same output - apparently, it generically sorts accented characters directly after their base characters, which may or may not be what a specific non-US locale actually does.
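To see what the locale actually changes, it can help to compare against a plain byte-order sort. A small sketch, assuming a UTF-8 encoded script and terminal (the variable names are illustrative):

```shell
# Byte-order (C locale) sort: the UTF-8 bytes of "ó" compare greater than
# any ASCII letter, so "Córdoba" lands after "Cundinamarca".
c_sorted=$(printf '%s\n' 'Chocó' 'Cundinamarca' 'Córdoba' | LC_ALL=C sort)
printf '%s\n' "$c_sorted"

# Locale-aware sort using whatever locale the current environment provides.
locale_sorted=$(printf '%s\n' 'Chocó' 'Cundinamarca' 'Córdoba' | sort)
printf '%s\n' "$locale_sorted"
```

If both results come out in byte order, the locale in effect is probably not the one you intended.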