
I have the following code. Whenever a label contains a Unicode string, annotate fails with an error. How do I resolve this?

from matplotlib import pyplot as plt
import numpy as Math


X = Math.genfromtxt(inputFile,autostrip=True,comments=None,dtype=Math.float64,usecols=(range(1,dim+1)))
labels = Math.genfromtxt(inputFile,autostrip=True,comments=None,dtype='str',usecols=(0))
Y = some_function(X, 2, 50, 20.0)
fig = plt.figure()
ax = fig.add_subplot(111)
plt.scatter(Y[:,0],Y[:,1])
for l,x,y in zip(labels,Y[:,0],Y[:,1]):
   ax.annotate('(%s)' %l, xy=(x,y), textcoords='offset points')

plt.grid()
plt.show()

Error:
Traceback (most recent call last):
ax.annotate('(%s)' %unicode(l), xy=(x,y), textcoords='offset points')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 4: ordinal not in range(128)
Lanc

1 Answer


You need to decode the byte string to unicode explicitly, rather than leaving Python to decode it implicitly with the ASCII codec:

from matplotlib import pyplot as plt

l = '\xe2'

plt.annotate('%s' % l, (0, 0))
# raises UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 0: ordinal not in range(128)

plt.annotate('%s' % l.decode('unicode-escape'), (0, 0))
# works
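
One caveat, offered as a hedged sketch rather than a definitive fix: 'unicode-escape' maps each non-ASCII byte to the Latin-1 character with the same value, so it avoids the error but does not necessarily recover the intended text. If the input file is UTF-8 (an assumption; a 0xe2 byte is often the first byte of a multi-byte UTF-8 sequence such as an en dash), decoding with 'utf-8' is probably what you want. The label bytes below are hypothetical, chosen only to show the difference:

```python
# Hypothetical label: "x", space, EN DASH (U+2013), space, "y", encoded as UTF-8.
raw = b'x \xe2\x80\x93 y'

# Decoding with the file's real encoding recovers the dash as one character.
as_utf8 = raw.decode('utf-8')

# 'unicode-escape' reads every byte as a separate Latin-1 character instead,
# so the three bytes of the dash become three unrelated characters.
as_escape = raw.decode('unicode-escape')

print(len(as_utf8))    # 5 characters
print(len(as_escape))  # 7 characters, one per input byte
```

Both calls succeed, which is why the wrong codec is easy to miss: the only symptom is garbled text in the annotation.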

You could also decode your input file as unicode like this:

import numpy as np

# converter function that decodes a string as unicode
conv = {0: (lambda s: s.decode('unicode-escape'))}

labels = np.genfromtxt(inputFile, dtype='unicode', converters=conv, usecols=0)

labels.dtype will then be unicode ('<Ux') instead of string ('|Sx') and therefore ax.annotate('(%s)' %l, ...) will work.
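
If the converter route gives trouble, a different technique is to skip genfromtxt for the label column and read it with the standard library. This is a minimal sketch, not the approach above: it assumes the file is UTF-8 and that the label is the first whitespace-separated field on each line. The helper name read_labels and the demo file contents are hypothetical.

```python
import io
import os
import tempfile


def read_labels(path, encoding='utf-8'):
    """Return the first whitespace-separated field of each non-empty line as text."""
    with io.open(path, 'r', encoding=encoding) as handle:
        return [line.split()[0] for line in handle if line.strip()]


# Build a small hypothetical input file: a label followed by two numeric columns.
fd, path = tempfile.mkstemp(suffix='.txt')
with io.open(fd, 'w', encoding='utf-8') as handle:
    handle.write(u'caf\u00e9 0.1 0.2\n')
    handle.write(u'a\u2013b 0.3 0.4\n')

labels = read_labels(path)  # list of unicode strings, already decoded
os.remove(path)
```

The numeric columns can still be loaded with genfromtxt as in the question; only the labels are read separately, so they are unicode by the time they reach ax.annotate.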

ali_m
  • Thanks! First option works for me, but the second doesn't work – Lanc Feb 11 '15 at 16:54
  • When you say it "doesn't work", could you be more specific? Does the error occur in `genfromtxt` or in `annotate`? It would be helpful if you could show what your `inputFile` looks like in your question. – ali_m Feb 11 '15 at 16:59
  • I figured out the problem - it's necessary to pass a converter argument to `genfromtxt` which decodes the input string as unicode. I've updated my answer to include this. – ali_m Feb 11 '15 at 17:31