I have two related ‘why’ (not ‘how to’) questions:
Question 1
While printf and od produce the same decimal, octal, and hex representations for ASCII characters:
ascii_char=A
printf "%d" "'$ascii_char"
65
echo -n $ascii_char | od -A n -t d1
65
echo -n $ascii_char | od -A n -t u1
65
printf "%o" "'$ascii_char"
101
echo -n $ascii_char | od -A n -t o1
101
printf "%x" "'$ascii_char"
41
echo -n $ascii_char | od -A n -t x1
41
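(For context: the agreement above is expected, since an ASCII character occupies a single byte whose value equals its code point. A minimal sketch, using 'A' as in the session above:)

```shell
# For ASCII, the code point (what printf "%d" "'A" reports) and the
# single byte in the encoded string (what od -t u1 reports) coincide.
printf '%d\n' "'A"            # code point of 'A'
printf 'A' | od -A n -t u1    # byte value of 'A' — the same number, 65
```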
why do they not produce the same representations for a Unicode character?
unicode_char=🐕
printf "%d" "'$unicode_char"
128021
echo -n $unicode_char | od -A n -t d1
-16 -97 -112 -107
echo -n $unicode_char | od -A n -t d
-1785683984
echo -n $unicode_char | od -A n -t u1
240 159 144 149
echo -n $unicode_char | od -A n -t u
2509283312
printf "%o" "'$unicode_char"
372025
echo -n $unicode_char | od -A n -t o1
360 237 220 225
echo -n $unicode_char | od -A n -t o
22544117760
printf "%x" "'$unicode_char"
1f415
echo -n $unicode_char | od -A n -t x1
f0 9f 90 95
echo -n $unicode_char | od -A n -t x
95909ff0
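(My current guess, illustrated: printf "'$unicode_char" seems to report the Unicode code point, U+1F415, while od dumps the four UTF-8 bytes of the encoded string. A sketch deriving those bytes from the code point by hand, assuming the standard 4-byte UTF-8 layout 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx:)

```shell
# Pack code point U+1F415 into a 4-byte UTF-8 sequence manually.
cp=$((0x1f415))
b1=$(( 0xf0 | (cp >> 18) ))           # leader byte: 11110xxx
b2=$(( 0x80 | ((cp >> 12) & 0x3f) ))  # continuation: 10xxxxxx
b3=$(( 0x80 | ((cp >> 6)  & 0x3f) ))  # continuation: 10xxxxxx
b4=$(( 0x80 |  (cp        & 0x3f) ))  # continuation: 10xxxxxx
printf '%x %x %x %x\n' "$b1" "$b2" "$b3" "$b4"
# prints: f0 9f 90 95 — exactly the bytes od shows
```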
Question 2
Given that od results for a Unicode character differ from those of printf, how come printf still knows how to convert od's results back to a character, while it cannot convert back its own results?
printf "%o" "'$unicode_char"
372025 # printf cannot convert back its own result
echo -n $unicode_char | od -A n -t o1
360 237 220 225 # looks different, but printf can convert it back correctly
printf %b '\360\237\220\225'
# success
printf "%x" "'$unicode_char"
1f415 # printf can convert back this result
printf "\U$(printf %08x 0x1f415)"
# success
echo -n $unicode_char | od -A n -t x1
f0 9f 90 95 # looks different, but printf can convert it back correctly
printf %b '\xf0\x9f\x90\x95'
# success
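(For what it's worth, printf's own octal result can be converted back too, by first re-reading it as a base-8 number; this mirrors the hex round-trip above. A sketch, assuming a bash with `$((8#…))` arithmetic and `\U` escape support:)

```shell
# Re-read 372025 as octal to recover the code point, then emit it via \U.
oct=372025
cp=$((8#$oct))                          # 128021 decimal, i.e. 0x1f415
printf "\U$(printf %08x "$cp")\n"       # prints the original character
```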