
When I store string data as characters in a NumPy array and retrieve the values later, a value I stored as 'x' always comes back as b'x'. Currently I extract the value with a clumsy workaround, str(some_array[row, col...]).lstrip("b'").rstrip("'"), but I believe there should be an easier way to do this. Does anyone know? Thanks!

user3833107
  • It would be useful to include code demonstrating the problem. Also, what version of Python and NumPy? – shx2 Jun 17 '15 at 06:20
  • I found that this happens if I use dtype = 'S', but not if I use dtype = str. I probably haven't fully understood what these data types mean... – user3833107 Jun 17 '15 at 06:34

1 Answer


Based on your output you're using Python 3 (where bytes and str are different types). This answer applies to arrays with dtype='S' (bytestrings). With dtype=str or dtype='U' the values are stored as unicode strings (at least on Python 3), so they come back as str and there is no issue.
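To make the difference between the two dtypes concrete, here is a minimal sketch. The array contents and the names byte_arr and text_arr are made up for illustration, and it assumes Python 3 with NumPy installed:

```python
import numpy as np

# Hypothetical sample data; the names and values are illustrative only.
byte_arr = np.array(['x', 'yz'], dtype='S')   # stored as bytestrings
text_arr = np.array(['x', 'yz'], dtype=str)   # stored as unicode strings

# dtype.kind is 'S' for bytestrings and 'U' for unicode strings.
print(byte_arr.dtype.kind, text_arr.dtype.kind)

# Elements of the 'S' array are bytes, so they print with the b prefix.
print(repr(byte_arr[0]))

# Elements of the 'U' array are str, so no b prefix appears.
print(repr(text_arr[0]))
```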

The easiest thing to do is probably

str(some_array[row,col...],encoding='ascii')

Note that you can use other encodings instead of ascii ('UTF-8' is common). Which one is right depends on the encoding that was used when the data went into the numpy array. If your data is plain ASCII (for example ordinary English letters and digits), it doesn't really matter, because ASCII and UTF-8 encode those characters identically. If you've put a str into the array, numpy uses ascii to encode it, so unless you've gone to special effort to do something different, 'ascii' should be right.
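As a rough sketch of the decoding approach described above, assuming a small made-up 2x2 array (the contents and the variable names value, same_value and decoded are illustrative, not from the question). It also shows np.char.decode, which is one way to convert a whole array at once rather than one element at a time:

```python
import numpy as np

# Hypothetical 2x2 bytestring array standing in for the asker's data.
some_array = np.array([['x', 'y'], ['z', 'w']], dtype='S')

# Decode a single element by passing an encoding to str().
value = str(some_array[0, 1], encoding='ascii')

# Equivalent: call .decode() on the bytes element directly.
same_value = some_array[0, 1].decode('ascii')

# Decode every element at once; the result is a unicode ('U') array.
decoded = np.char.decode(some_array, 'ascii')

print(value, same_value)
print(decoded.dtype.kind)
```

If most of the code downstream needs str values, converting the whole array once is usually tidier than decoding at every lookup.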

(For reference, I found this answer helpful https://stackoverflow.com/a/14010552/4657412)

Community
DavidW