I have a list of lists, just shy of two million elements, each with 7 entries.
I run a machine learning algorithm on the data and would like to append the result of the classification to the end of each element.
I use the .append() method, something like:
for j in range(len(data)):
    data[j].append(results[j])
However, this takes a very long time (after 8+ hours it still had not terminated).
I'm wondering if there is a more efficient way to do this. The data is read in from a CSV file, so maybe I could write the results directly into the CSV?
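If writing straight to a new CSV is the better route, this is roughly what I had in mind (a minimal sketch; the output filename is just a placeholder, and I'm assuming results is already computed and lines up row-for-row with the file):

import csv

# Sketch: re-read the input and stream each row to a new file with its
# classification appended, instead of mutating the lists in memory.
# "measles_data_labeled.csv" is a placeholder name.
with open("measles_data_b", "r", newline="") as fin, \
        open("measles_data_labeled.csv", "w", newline="") as fout:
    reader = csv.reader(fin)
    writer = csv.writer(fout)
    for row, res in zip(reader, results):
        writer.writerow(row + [res])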
I was thinking about using numpy arrays, but I recall someone saying that lists are faster.
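For the numpy route, this is the kind of thing I was imagining (a sketch only; everything comes out as strings since csv.reader yields strings, and column_stack makes a full copy):

import numpy as np

# Sketch: treat the data as a 2-D string array and bolt the results on
# as an eighth column. t and results are as in the code in my edit below.
arr = np.array(t)                          # shape (1971203, 7)
labeled = np.column_stack((arr, results))  # shape (1971203, 8)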
Anyone have any ideas?
EDIT: Here is my code:
import csv

with open("measles_data_b", 'r') as f:
    reader = csv.reader(f)
    t = list(reader)

### Perform the machine learning. That bit works fine.

# At this point, t is a list with len(t) == 1971203, and each element
# of t has 7 elements of its own.
# results is a list with the same number of elements. Its entries are
# one of three things: '1', '2', '0'.
for j in range(len(t)):
    t[j].append(results[j])
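For completeness, the same loop can be written without the explicit indexing, though I don't know whether that alone would change the running time much:

# Same operation as above, just iterating over the two lists in lockstep.
for row, result in zip(t, results):
    row.append(result)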