How to find the N largest rows in a large CSV file using pandas (chunked)?

Question

Given a very large CSV file with many rows and 3 columns, the file is read as follows:

import pandas as pd

df = pd.read_csv("test.csv", sep=" ", chunksize=100000)

Now, how can I get the N largest rows based on the values in the 3rd column when chunksize is used?

Does this answer your question? [How to get top 5 values from pandas dataframe?](https://stackoverflow.com/questions/47462690/how-to-get-top-5-values-from-pandas-dataframe) — Tomerikoo, Sep 09 '21 at 11:57
Does this answer your question? [Get first and second highest values in pandas columns](https://stackoverflow.com/q/39066260/6045800) — Tomerikoo, Sep 09 '21 at 11:58

score 0 · Answer 1 · answered Sep 09 '21 at 09:50


Try this:

print(df.nlargest(N, columns=df.columns[2]))
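Note that when `chunksize` is passed, `pd.read_csv` returns an iterator of DataFrames (a `TextFileReader`), not a single DataFrame, so `nlargest` has to be applied per chunk. One possible approach, as a rough sketch: keep a running top N, and after each chunk merge that chunk's N largest rows into it. The helper name `nlargest_chunked`, the sample data, and the assumption that the file has a header row are mine, not from the question.

```python
import os
import tempfile

import pandas as pd


def nlargest_chunked(path, n, chunksize=100_000, sep=" "):
    """Hypothetical helper: n rows with the largest values in the 3rd column."""
    top = None  # running best-n rows seen so far
    for chunk in pd.read_csv(path, sep=sep, chunksize=chunksize):
        col = chunk.columns[2]  # 3rd column, as in the question
        best = chunk.nlargest(n, col)  # best rows of this chunk only
        if top is not None:
            best = pd.concat([top, best])  # merge with the running best
        top = best.nlargest(n, col)  # trim back down to n rows
    return top


# Small made-up file (assumes a header row) to exercise the helper.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.csv")
    sample = pd.DataFrame(
        {"a": range(10), "b": range(10, 20), "c": [5, 1, 9, 3, 7, 2, 8, 0, 6, 4]}
    )
    sample.to_csv(path, sep=" ", index=False)
    result = nlargest_chunked(path, n=3, chunksize=4)
    print(result)
```

Only about 2 * N rows plus one chunk are held in memory at any time, so this should scale to files that do not fit in RAM. If the file has no header row, `header=None` would need to be passed to `read_csv`.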

U13-Forward

@Win It should, I guess. You should try it; remember to accept and upvote if it helps – U13-Forward Sep 09 '21 at 12:42
Doesn't work, it gives "AttributeError: 'TextFileReader' object has no attribute 'nlargest'" – Win Sep 09 '21 at 13:31
