I'm currently loading in a file as below:
1400,,,2001,101,1000,1,07,08,332,8,2,,,,1,,9,,21,,36,,39,,53,,68,,95,,,,,0,8,,,
1400,,,2001,101,1000,2,07,08,222,11,1,,,,1,,1,,2,,12,,13,,21,,48,,112,,,,,0,11,,,
1400,,,2001,101,1001,1,07,08,24,0,0,,,,0,,1,,3,,7,,2,,3,,3,,5,,,,,0,0,,,
1400,,,2001,101,1001,2,07,08,14,0,0,,,,0,,0,,0,,3,,1,,4,,0,,6,,,,,0,0,,,
1400,,,2001,101,1002,1,07,08,0,0,0,,,,0,,0,,0,,0,,0,,0,,0,,0,,,,,0,0,,,
1402,,,2001,101,I25,1,07,08,0,0,0,,,,0,,0,,0,,0,,0,,0,,0,,0,,,,,0,0,,,
1401,,,2001,101,I26,2,07,08,0,0,0,,,,0,,0,,0,,0,,0,,0,,0,,0,,,,,0,0,,,
All of the columns should be ints, except for the 6th column (values like 1000, I25), which I've set to be a string. I load the file in as follows:
data = np.genfromtxt(sys.argv[1], dtype=(int,int,int,int,int,"|S25",int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int), skip_header=1, delimiter=",")
I have to do this because otherwise genfromtxt treats every column as an int and sets the 6th column to -1.
I then build a mask so that only rows whose first column is 1400 are printed:
mask_country = (data[:,0] == 1400)
This, however, gives the error:
Traceback (most recent call last):
File "Python/iw2.py", line 14, in <module>
mask_country = (data[:,0] == 1400)
IndexError: too many indices
It's strange, because if I remove the dtype=() argument from the genfromtxt call, or just specify every column as int with dtype=int, it runs perfectly.
Why does specifying the data type for the columns individually result in this error?
If I don't set the mask I can print 'data' and it seems to be setting things correctly, as the last line is as follows:
(1401, -1, -1, 2001, 101, 'I26', 2, 7, 8, 0, 0, 0, -1, -1, -1, 0, -1, 0, -1, 0, -1, 0, -1, 0, -1, 0, -1, 0, -1, 0, -1, -1, -1, -1, 0, 0, -1, -1, -1)]
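For what it's worth, the behaviour can be reproduced on a much smaller input. The sketch below is only an illustration under stated assumptions: the three-column inline sample and the field names country, code and sex are hypothetical stand-ins for my real file, and the named dtype is used just to keep the example short. It suggests that with per-column types the result is a 1-D array of records rather than a 2-D array, which would explain why data[:,0] has one index too many, and that selecting the column by field name does work.

```python
import io
import numpy as np

# Hypothetical three-column sample (int, string, int); not the real file.
sample = "1400,1000,1\n1400,1001,2\n1402,I25,1\n"

# Hypothetical field names, chosen only for this illustration.
row_dtype = [("country", int), ("code", "U25"), ("sex", int)]

data = np.genfromtxt(io.StringIO(sample), dtype=row_dtype, delimiter=",")

# With per-column types each row becomes a single record,
# so the array has one axis only.
print(data.ndim, data.shape)

# Two indices on a one-axis array raise IndexError.
try:
    data[:, 0]
    two_index_failed = False
except IndexError:
    two_index_failed = True

# Selecting the column by field name gives a plain 1-D array to compare against.
mask_country = data["country"] == 1400
selected = data[mask_country]
print(selected)
```

Here the mask is built from a named field instead of a positional column index; with the default names that genfromtxt assigns when none are given, the first field would be addressed by its generated name rather than by position.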