17

I am using Gina Trapani's excellent todo.sh to organize my todo list.

However, being a Dane, I would like the script to accept special Danish characters like ø and æ.

I am an absolute UNIX-n00b, so it would be a great help if anybody could tell me how to fix this! :)

timkl
  • I've successfully used `todo.sh` with extended characters on Mac OS X. Which platform are you using? – smokris Jan 10 '10 at 22:04

2 Answers

21

Slowly, the Unix world is moving from ASCII and other regional encodings to UTF-8. You need to be running a UTF-8-capable terminal, such as a modern xterm or PuTTY.

In your ~/.bash_profile, set your language to one of the UTF-8 variants.

export LANG=C.UTF-8
or
export LANG=en_AU.UTF-8
etc.

You should then be able to write UTF-8 characters in the terminal, and include them in bash scripts.

#!/bin/bash
echo "UTF-8 is græat ☺"
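
If you are unsure whether a given locale is actually installed, here is a minimal sketch (not part of todo.sh) that checks a name against the output of `locale -a` before you export it. The function names `normalize_locale` and `locale_available` are made up for illustration, and the sketch assumes the `locale` command exists on your system, which may not be true everywhere.

```shell
#!/bin/bash
# Hypothetical helper: lowercase a locale name and write the codeset as "utf8",
# because `locale -a` often lists "da_DK.utf8" rather than "da_DK.UTF-8".
normalize_locale() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/utf-8/utf8/'
}

# Hypothetical helper: succeed (exit status 0) only if the requested locale
# appears in the list of installed locales. If `locale` is missing, this fails.
locale_available() {
    locale -a 2>/dev/null \
        | tr '[:upper:]' '[:lower:]' \
        | sed 's/utf-8/utf8/' \
        | grep -qxF -- "$(normalize_locale "$1")"
}
```

For example, `locale_available da_DK.UTF-8 && export LANG=da_DK.UTF-8` would only set the variable when the locale is present.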

See also: https://serverfault.com/questions/11015/utf-8-and-shell-scripts

brianegge
  • On a TTY (not xterm), the terminal might not be UTF-8 capable until `unicode_start` is run. (This is unrelated to locale and shell/application support.) Some distributions enable this at boot, but some don't. – ephemient Jan 10 '10 at 22:06
  • Unrelated to OP's question, but posting just for the record: this fixed an issue I had when debugging a Python script with ipdb. It was returning `*** UnicodeEncodeError: 'ascii' codec can't encode character '\u22f1' in position 314: ordinal not in range(128)` every time I tried to print a variable. Setting `LANG=en_US.UTF-8` did not help; it only worked after `export LANG=C.UTF-8`. – Yamaneko Oct 09 '17 at 22:00
  • I came here exactly for the `ipdb` problem, but while I'm able to correctly print this: `echo "UTF-8 is græat ☺"`, I'm still getting the `UnicodeEncodeError` in `ipdb` :( – Andrea Grandi Oct 10 '19 at 06:20
18

What does this command show?

locale

It should show something like this for you:

LC_CTYPE="da_DK.UTF-8"
LC_NUMERIC="da_DK.UTF-8"
LC_TIME="da_DK.UTF-8"
LC_COLLATE="da_DK.UTF-8"
LC_MONETARY="da_DK.UTF-8"
LC_MESSAGES="da_DK.UTF-8"
LC_PAPER="da_DK.UTF-8"
LC_NAME="da_DK.UTF-8"
LC_ADDRESS="da_DK.UTF-8"
LC_TELEPHONE="da_DK.UTF-8"
LC_MEASUREMENT="da_DK.UTF-8"
LC_IDENTIFICATION="da_DK.UTF-8"
LC_ALL=

If not, you might try doing this before you run your script:

export LANG=da_DK.UTF-8
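
Keep in mind that setting LANG only has an effect when LC_ALL and the relevant LC_* variable are unset or empty: LC_ALL takes precedence over LC_CTYPE, which takes precedence over LANG. Here is a rough sketch of that lookup order for the character-type category. The helper names `resolve_ctype` and `is_utf8_name` are hypothetical, and the fallback of `C` when nothing is set is an assumption (the real default is implementation-defined).

```shell
#!/bin/bash
# Hypothetical helper: given the values of LC_ALL, LC_CTYPE and LANG (in that
# order), print the one that decides the character type. Empty means "not set".
resolve_ctype() {
    if [ -n "$1" ]; then
        printf '%s\n' "$1"      # LC_ALL overrides everything
    elif [ -n "$2" ]; then
        printf '%s\n' "$2"      # then the category-specific variable
    elif [ -n "$3" ]; then
        printf '%s\n' "$3"      # then LANG
    else
        printf '%s\n' "C"       # assumed default; implementation-defined
    fi
}

# Hypothetical helper: succeed if a locale name mentions a UTF-8 codeset.
is_utf8_name() {
    case "$1" in
        *.UTF-8*|*.utf8*|*.UTF8*|*.utf-8*) return 0 ;;
        *) return 1 ;;
    esac
}
```

You could call it as `resolve_ctype "${LC_ALL:-}" "${LC_CTYPE:-}" "${LANG:-}"` and pass the result to `is_utf8_name` to see whether the locale your script will actually use is a UTF-8 one.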

You don't say what happens when you run the script and it encounters these characters. Are they in the todo file? Are they entered at a prompt? Is there an error message? Is something output in place of the expected output?

Try this and see what you get:

read -r -p "Enter some characters: " string
echo "$string"
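
If the echoed text looks wrong, it can help to look at the raw bytes. In UTF-8, ø is the two bytes c3 b8 and æ is c3 a6; a single byte f8 or e6 would suggest the terminal is sending Latin-1 instead. A small sketch, with a hypothetical helper name `hex_bytes`, assuming `od`, `tr` and `sed` are available:

```shell
#!/bin/bash
# Hypothetical helper: print the bytes of a string as space-separated hex pairs.
hex_bytes() {
    printf '%s' "$1" \
        | od -An -tx1 \
        | tr -s ' \n' '  ' \
        | sed 's/^ *//; s/ *$//'
}
```

After the `read` above, running `hex_bytes "$string"` shows exactly what the shell received, independent of how the terminal renders it.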
Dennis Williamson