This is really odd.
I have a file temp.txt that has the following format:
1 1:1 1:1 *0.9 0 0 0.1 0 0
2 1:1 1:1 *1 0 0 0 0 0
3 1:1 1:1 *1 0 0 0 0 0
4 1:1 2:2 + 0.2 *0.7 0.1 0 0 0
5 1:1 1:1 *1 0 0 0 0 0
6 1:1 1:1 *0.9 0 0 0.1 0 0
7 1:1 1:1 *1 0 0 0 0 0
8 1:1 1:1 *1 0 0 0 0 0
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
6593 1:1 1:1 *1 0 0 0 0 0
The meaning of the numbers themselves is unimportant (it's WEKA output, if anyone is curious). What I want is to take the number to the right of the second colon on each line and write it to a separate file, classes.txt, with one number per line, as follows:
1
1
1
2
1
.
.
.
I wrote the following Python script to accomplish this:
initial = open('temp.txt')
final = open('classes.txt','w')
for line in initial:
    final.write(list(line.rsplit(':',1)[1])[0]+'\n')
It works perfectly for the first 5462 lines, but for no apparent reason it stops there. The numbers from the remaining 1131 lines (5463 - 6593) are absent from classes.txt. I copied and pasted the skipped lines into a separate txt file and ran the script on THAT file, but the resulting classes.txt was empty.
This problem is really stumping me because I can see no obvious difference between lines 5462 and 5463, shown below:
5461 1:1 1:1 *1 0 0 0 0 0
5462 1:1 1:1 *1 0 0 0 0 0
5463 1:1 4:4 + 0.3 0 0 *0.6 0.1 0
5464 1:1 1:1 *0.8 0 0 0.2 0 0
For the record, I altered the script to print the lines to the console and it did that just fine. The problem appears to be with writing those lines to the file. Any help would be greatly appreciated.
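
One possible explanation, offered as a guess and not a confirmed diagnosis: the script never closes classes.txt, so whatever is still sitting in the write buffer when the script ends (or when the file is inspected) may never reach the disk. That would be consistent with both symptoms: a large input loses its tail, and a small input produces an empty file. Below is a minimal sketch of the same loop using a `with` block, which closes and flushes both files on exit. The function name `extract_classes` is hypothetical, and the sketch takes the first whitespace-separated token after the last colon instead of the first character, on the assumption that class labels could have more than one digit.

```python
def extract_classes(src_path, dst_path):
    """Write the value after the last colon of each line in src_path to dst_path."""
    # The with block closes (and therefore flushes) both files,
    # even if an exception is raised inside the loop.
    with open(src_path) as initial, open(dst_path, "w") as final:
        for line in initial:
            if ":" not in line:
                continue  # skip blank or malformed lines
            # Text after the last colon, e.g. "2 + 0.2 *0.7 0.1 0 0 0"
            tail = line.rsplit(":", 1)[1]
            tokens = tail.split()
            if not tokens:
                continue  # nothing follows the colon on this line
            # First token after the colon is the class label
            final.write(tokens[0] + "\n")
```

With the file names from the question, this would be called as `extract_classes('temp.txt', 'classes.txt')`. If the original loop is kept as written, adding `final.close()` after it should have the same effect.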