What is the most efficient way to look up index values in a large DataFrame?

Asked Dec 19 '22 at 09:08

Active Dec 19 '22 at 09:08

Viewed 21 times

I have a large DataFrame (6M rows) that looks like this:

Column A	Column B
000001	AB1234
000002	CD1234

Column A is unique, but Column B is not. I have lists of index values (Column A values) to look up in this large df, and for every value in a list I want the corresponding Column B value as my result. A list looks like the one below; I have 4K such lists, and each list is long.

query_list = ['000002', '000003', '000014', '000101']

Running on Python 3.x, Jupyter Notebook, pandas 1.3.x.

I have tried df.query() and df[df["column name"].str.contains()], but both take a long time:

df.query() took 57x s
df[df["column name"].str.contains()] took 7xx s
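For reference, the two attempts presumably looked something like the sketch below. The exact expressions are not given above, so the query string, the regex pattern, and the toy data are all assumptions made for illustration.

```python
import pandas as pd

# Toy stand-in for the real 6M-row frame (values are made up).
df = pd.DataFrame(
    {
        "Column A": ["000001", "000002", "000003"],
        "Column B": ["AB1234", "CD1234", "AB1234"],
    }
)
query_list = ["000002", "000003"]

# Attempt 1 (assumed form): df.query() with a membership test against a local list.
via_query = df.query("`Column A` in @query_list")

# Attempt 2 (assumed form): one regex alternation built from the whole list.
pattern = "|".join(query_list)
via_contains = df[df["Column A"].str.contains(pattern)]

print(via_query["Column B"].tolist())
print(via_contains["Column B"].tolist())
```

Both forms scan all of Column A once per query list, so 4K lists mean 4K full passes over the 6M rows. The str.contains() form also does substring regex matching rather than exact matching, which adds work per row.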

I have also tried running this code with Pool.map(), but it didn't work.

Is there any solution?
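One possible direction, as a minimal sketch rather than a definitive answer: because Column A is unique, it can be made the index once, after which each query list becomes a label lookup instead of a scan of the whole column. The toy data and the helper name column_b_for are hypothetical, and no timing for the real 6M-row frame is claimed here.

```python
import pandas as pd

# Toy stand-in for the real 6M-row frame (values are made up).
df = pd.DataFrame(
    {
        "Column A": ["000001", "000002", "000003", "000014", "000101"],
        "Column B": ["AB1234", "CD1234", "AB1234", "EF5678", "CD1234"],
    }
)

# Build the lookup once: Column B values keyed by the unique Column A.
lookup = df.set_index("Column A")["Column B"]


def column_b_for(query_list):
    """Return Column B for each key, in query order (NaN for keys not present)."""
    return lookup.reindex(query_list)


query_list = ["000002", "000003", "000014", "000101"]
result = column_b_for(query_list)
print(result.tolist())  # ['CD1234', 'AB1234', 'EF5678', 'CD1234']
```

reindex() is used here instead of .loc because it returns NaN for keys that are missing from the frame rather than raising a KeyError. If every key is known to exist, lookup.loc[query_list] gives the same values.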

橘橘仔

0 Answers