Here is a mostly POSIX[1] shell script that prints each code point next to its character, with the help of fc-match,
which is mentioned in Neil Mayhew's answer (it can handle code points of up to 8 hex digits):
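#!/bin/bash
# fc-match prints the charset as space-separated hex ranges, e.g. "20-7e"
for range in $(fc-match --format='%{charset}\n' "$1"); do
    # ${range%-*} is the start of the range, ${range#*-} the end
    for n in $(seq "0x${range%-*}" "0x${range#*-}"); do
        n_hex=$(printf "%04x" "$n")
        # \U takes up to 8 hex digits, so it also covers 5-hex-digit code points
        printf "%-5s\U$n_hex\t" "$n_hex"
        count=$((count + 1))
        # start a new line after every 10 characters
        if [ $((count % 10)) = 0 ]; then
            printf "\n"
        fi
    done
done
printf "\n"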
You can pass the font name or anything else that fc-match accepts:
$ ls-chars "DejaVu Sans"
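The range handling in the script relies on two parameter expansions, which are easy to misread. Here is a minimal sketch of what they do, using a hypothetical helper named range_bounds that is not part of the script itself:

```shell
#!/bin/sh
# range_bounds is a hypothetical helper, used here only for illustration.
# It prints the first and last code point (in hex) of one charset entry.
range_bounds() {
    range=$1
    # ${range%-*} strips the shortest suffix matching "-*" (keeps the start);
    # ${range#*-} strips the shortest prefix matching "*-" (keeps the end).
    # An entry without "-" is left unchanged by both, so start equals end.
    printf '%s %s\n' "${range%-*}" "${range#*-}"
}

range_bounds 20-7e    # a range: prints "20 7e"
range_bounds 1f643    # a single code point: prints "1f643 1f643"
```

This is why the inner loop needs no special case for single code points: seq is simply called with the same start and end.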
Updated content:
I learned that subshells are very time-consuming (the printf
command substitution in my script runs once per character). So I wrote an improved version that is 5-10 times faster!
#!/bin/bash
for range in $(fc-match --format='%{charset}\n' "$1"); do
    for n in $(seq "0x${range%-*}" "0x${range#*-}"); do
        printf "%04x\n" "$n"
    done
done | while read -r n_hex; do
    count=$((count + 1))
    printf "%-5s\U$n_hex\t" "$n_hex"
    [ $((count % 10)) = 0 ] && printf "\n"
done
printf "\n"
Old version:
$ time ls-chars "DejaVu Sans" | wc
592 11269 52740
real 0m2.876s
user 0m2.203s
sys 0m0.888s
New version (the line count of 592, at 10 characters per line, indicates 5910+ characters, in 0.4 seconds!):
$ time ls-chars "DejaVu Sans" | wc
592 11269 52740
real 0m0.399s
user 0m0.446s
sys 0m0.120s
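As a side note, bash offers another way to avoid the per-character command substitution: printf -v assigns the formatted result to a variable without forking. This is a different technique from the pipeline used in the new version, and it is bash-specific. A minimal sketch, where both helper names are hypothetical and exist only to compare the two approaches:

```shell
#!/bin/bash
# Both helpers are hypothetical and only illustrate the two approaches.

# One command substitution (and therefore one fork) per code point.
hex_with_subshell() {
    for n in "$@"; do
        n_hex=$(printf '%04x' "$n")
        printf '%s\n' "$n_hex"
    done
}

# printf -v (bash only) stores the result in n_hex without forking.
hex_with_printf_v() {
    for n in "$@"; do
        printf -v n_hex '%04x' "$n"
        printf '%s\n' "$n_hex"
    done
}

# Both print the same hex values for decimal 32 and 128515.
hex_with_subshell 32 128515
hex_with_printf_v 32 128515
```

I have not timed this variant against the numbers above, so treat any speedup as an assumption to measure on your own system.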
End of update
Sample output (it aligns better in my st terminal):
0020 0021 ! 0022 " 0023 # 0024 $ 0025 % 0026 & 0027 ' 0028 ( 0029 )
002a * 002b + 002c , 002d - 002e . 002f / 0030 0 0031 1 0032 2 0033 3
0034 4 0035 5 0036 6 0037 7 0038 8 0039 9 003a : 003b ; 003c < 003d =
003e > 003f ? 0040 @ 0041 A 0042 B 0043 C 0044 D 0045 E 0046 F 0047 G
...
1f61a 1f61b 1f61c 1f61d 1f61e 1f61f 1f620 1f621 1f622 1f623
1f625 1f626 1f627 1f628 1f629 1f62a 1f62b 1f62d 1f62e 1f62f
1f630 1f631 1f632 1f633 1f634 1f635 1f636 1f637 1f638 1f639
1f63a 1f63b 1f63c 1f63d 1f63e 1f63f 1f640 1f643
[1] It seems that \U in printf is not part of the POSIX standard, which is why the scripts use #!/bin/bash.
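Since \U may not be available everywhere, a script could check at run time whether the current shell's printf understands it before relying on it. A minimal sketch, where printf_has_big_u is a hypothetical helper name and not something taken from the scripts above:

```shell
#!/bin/sh
# printf_has_big_u is a hypothetical helper, for illustration only.
# It succeeds when the shell's printf expands \U escapes:
# U+0041 is "A", so a printf that understands \U prints exactly "A".
printf_has_big_u() {
    [ "$(printf '\U00000041' 2>/dev/null)" = "A" ]
}

if printf_has_big_u; then
    printf '%s\n' 'printf supports \U'
else
    printf '%s\n' 'printf does not support \U; try running the script with bash' >&2
fi
```

The check uses an ASCII code point on purpose, so the result should not depend on the terminal's locale or font.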