I have a data table with ~74 million rows that I loaded with blaze:
from blaze import CSV, data
csv = CSV('train.csv')  # lazy handle to the file; nothing is read yet
t = data(csv)           # wrap it as a blaze expression
It has these fields: A, B, C, D, E, F, G.
Since the table is so large, how can I efficiently pull out the rows that match specific criteria? For example, I want the rows where A == 4, B == 8, and E == 10. Is there a way to parallelize the lookup, e.g. with threads or multiprocessing?
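The single-pass version I know how to write is just a boolean filter (a minimal sketch using blaze's expression syntax, with the column names from above):

matches = t[(t.A == 4) & (t.B == 8) & (t.E == 10)]
print(matches)  # blaze only evaluates the expression when results are requested

This works, but it scans the whole file sequentially, which is what I'd like to avoid.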
By parallel programming I mean, for example, that one worker scans rows 1 to 100000 for matches, a second worker scans rows 100001 to 200000, and so on.
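Here is roughly the shape of what I'm imagining, sketched with pandas chunks plus multiprocessing instead of blaze (the file name and column names match my setup above, and the chunk size matches the split I described; whether this actually beats a single sequential scan is part of my question):

import multiprocessing as mp
import pandas as pd

CHUNK = 100000  # rows per task, matching the split described above

def scan_chunk(chunk):
    # each worker filters its own slice of the file independently
    return chunk[(chunk['A'] == 4) & (chunk['B'] == 8) & (chunk['E'] == 10)]

if __name__ == '__main__':
    reader = pd.read_csv('train.csv', chunksize=CHUNK)
    with mp.Pool() as pool:
        # imap feeds chunks to workers one at a time instead of
        # materializing all 74M rows in memory at once
        parts = pool.imap(scan_chunk, reader)
        result = pd.concat(parts, ignore_index=True)
    print(result)

One caveat I can already see: the CSV parsing still happens serially in the parent process here, so only the filtering itself is parallelized. Is there a better way to split the work?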