I'm pulling in data using urllib (Python 3) that arrives looking like this:

b'\n  9 27  70.40 43.40  0.00  15.90   3218.5    \n  9 28  74.90 43.70  0.00  18.30   3236.8'

When I convert this to a string, it ends up like this:

"\\n  9 27  70.40 43.40  0.00  15.90   3218.5    \\n  9 28  74.90 43.70  0.00  18.30   3236.8"   

I'd like to use numpy.genfromtxt to build an array, but I can't get io.StringIO to parse the newline characters. When I use:

table = io.StringIO(match.group(1), newline=r"\\n")  # or newline=r"\n"

I get an error message:

ValueError: illegal newline value: '\\\\n'

I've also tried keeping the data in the native bytes format and using io.BytesIO but I have the same problem.

triphook

2 Answers

Try this (with numpy imported as np, StringIO imported from io, and b holding the bytes returned by urllib):

np.genfromtxt(StringIO(b.decode()))

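The issue is probably with how you convert from bytes to string. Calling str() on a bytes object returns its repr, in which each newline appears as a literal backslash followed by n, whereas decode() produces real newline characters that StringIO splits on without any newline argument.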

The output is:

array([[   9. ,   27. ,   70.4,   43.4,    0. ,   15.9, 3218.5],
       [   9. ,   28. ,   74.9,   43.7,    0. ,   18.3, 3236.8]])
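
For reference, the whole round trip can be sketched as below. This is a minimal illustration, assuming the payload is the two-row sample from the question and that the default UTF-8 decoding is appropriate; the names raw, text and table are made up for the example.

```python
import io

import numpy as np

# Sample payload copied from the question (assumed to be what urllib returned).
raw = b'\n  9 27  70.40 43.40  0.00  15.90   3218.5    \n  9 28  74.90 43.70  0.00  18.30   3236.8'

# decode() turns the bytes into text with real newline characters.
text = raw.decode()

# StringIO needs no newline argument; genfromtxt splits on whitespace
# and skips the empty first line.
table = np.genfromtxt(io.StringIO(text))

print(table.shape)
```

The resulting array should have one row per data line and one column per whitespace-separated field.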
Roy2012

You're using the wrong method to convert bytes to string. b'\n' should convert to '\n' (assuming an ASCII-compatible encoding).

The correct way is with bytes.decode(), for example:

>>> b'\n'.decode()
'\n'

For more details see Convert bytes to a string, though it also covers Python 2, which has a wildly different approach to binary data.
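
To make the difference concrete, here is a small sketch contrasting str() with decode() on a shortened version of the sample data. The variable names are illustrative, and it assumes Python is run without the -b/-bb flags, which warn or raise when str() is called on bytes.

```python
# Shortened sample in the same shape as the data from the question.
raw = b'\n  9 27  70.40 43.40\n  9 28  74.90 43.70'

as_repr = str(raw)      # the repr of the bytes object, including the b'...' wrapper
as_text = raw.decode()  # the actual text, decoded as UTF-8 by default

# The repr holds a backslash followed by "n", not a newline character.
repr_has_newline = '\n' in as_repr
text_has_newline = '\n' in as_text

print(as_repr.startswith("b'"), repr_has_newline, text_has_newline)
```

Only the decoded text contains real newlines, which is why it can be handed to io.StringIO as-is.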

wjandrea