I want to read a txt file using numpy's genfromtxt. The file t.txt looks as follows:
###############
PSZ1 G096.89+24.17
PSZ1 G108.18−11.53
RXC J0225.1−2928
RXC J1053.7+5452
RXC J1234.2+0947
RXC J1314.4−2515
S 1081
ZwCl 0008.8+5215
ZwCl 2341+0000
1E 0657−558
1RXS J0603.3+4214
24P 73
I import numpy and run genfromtxt as follows:
import numpy as np
a = np.genfromtxt("t.txt", comments="#", dtype=None, autostrip=True, delimiter=" ")
and that returns the following when issuing print a:
array([['PSZ1', 'G096.89+24.17'],
['PSZ1', 'G108.18\xe2\x88\x9211.53'],
['RXC', 'J0225.1\xe2\x88\x922928'],
['RXC', 'J1053.7+5452'],
['RXC', 'J1234.2+0947'],
['RXC', 'J1314.4\xe2\x88\x922515'],
['S', '1081'],
['ZwCl', '0008.8+5215'],
['ZwCl', '2341+0000'],
['1E', '0657\xe2\x88\x92558'],
['1RXS', 'J0603.3+4214'],
['24P', '73']],
dtype='|S15')
I would like to know what causes the additional strings containing \x and how to get rid of them while still using genfromtxt.
Furthermore, many other methods of reading strings show the same problem (the additional \x strings), even when the example from this post (t.txt) is copied directly into a txt or csv file.
I created the file t.txt in the Atom editor, which shows UTF-8 as the encoding in the bottom bar. I also saved the file again as UTF-8.
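For reference, here is a minimal sketch (standard library only) for checking what the three bytes \xe2\x88\x92 from the output stand for when they are treated as UTF-8. The variable names are only illustrative:

```python
import unicodedata

# The three bytes that appear in the genfromtxt output as \xe2\x88\x92
raw = b"\xe2\x88\x92"

# Interpret them as UTF-8 and look up the resulting character
char = raw.decode("utf-8")
print(unicodedata.name(char))   # official Unicode name of the character
print(hex(ord(char)))           # its code point
print(char == u"-")             # compare with the ASCII hyphen-minus from the keyboard
```

If the name printed is not HYPHEN-MINUS, the file contains a typographic minus character rather than the plain ASCII one, and the \x sequences are simply its byte representation.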
How can I read these apparently mis-encoded signs properly in Python without fixing each one by hand?
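One direction that might work, as a rough sketch only: decode the file as UTF-8 and replace the Unicode minus sign (U+2212, assumed here to be the character behind \xe2\x88\x92) with the ASCII "-" before the data is parsed. The helper name read_normalized_lines is hypothetical and not part of numpy:

```python
import io

# Unicode MINUS SIGN; assumed to be the character stored in t.txt
MINUS_SIGN = u"\u2212"


def read_normalized_lines(path):
    """Return the lines of a UTF-8 text file with U+2212 replaced by ASCII '-'."""
    with io.open(path, encoding="utf-8") as handle:
        return [line.replace(MINUS_SIGN, u"-") for line in handle]
```

Since genfromtxt is documented to accept a list of strings as well as a file name, the list returned by read_normalized_lines("t.txt") could presumably be passed to it in place of "t.txt", with the other arguments unchanged.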
Thanks