I have a .txt file which has the following format:
1 a 0.01 0.03 0.01 ...
2 b 0.04 0.03 0.01 ...
which may contain any number of additional numeric columns. I need to find the index of the maximum value in each row, across roughly 2.5 million rows.
So far my approach has been to read the file line by line and collect the argmax of each row. I've been trying to avoid reading the whole file into memory because of its size:
import numpy as np

indices = []
with open("file.txt") as f:
    for line in f:
        # skip the first two columns (row id and letter)
        numbers = [float(s) for s in line.split()[2:]]
        indices.append(np.argmax(numbers))
However, this takes a very long time, and I'm wondering whether there is a more efficient method or package I could use.
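For context, the kind of thing I'm imagining is reading the file in chunks and doing the argmax vectorized, e.g. with pandas' read_csv and its chunksize parameter (the file name and chunk size here are just placeholders). A minimal sketch on a tiny sample in the same format:

```python
import numpy as np
import pandas as pd

# Tiny sample file in the question's format (id, letter, numeric columns).
with open("sample.txt", "w") as f:
    f.write("1 a 0.01 0.03 0.01\n")
    f.write("2 b 0.04 0.03 0.01\n")

parts = []
# chunksize keeps only part of the file in memory at a time;
# sep=r"\s+" splits on whitespace, header=None since there is no header row.
for chunk in pd.read_csv("sample.txt", sep=r"\s+", header=None, chunksize=100_000):
    # Drop the first two columns (row id and letter), then take the
    # per-row argmax over the remaining numeric columns at once.
    values = chunk.iloc[:, 2:].to_numpy(dtype=float)
    parts.append(values.argmax(axis=1))

indices = np.concatenate(parts)
print(indices)
```

For the sample above this gives [1 0]: row 1's maximum (0.03) is in numeric column 1, row 2's (0.04) in column 0. But I don't know if this is actually the fastest way for 2.5 million rows.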