Finding the top 3 locations with the most snow and storing them

Question

This is my current code:

    df = df2016.nlargest(3, ['Snow Mean'])
    df.to_csv('top3.csv')

However, this code returns duplicated locations: the three highest snow values all come from the same location, so I get that one location three times.

Here is my data. In the image, the second column is the location and the last column is the snow. I want my code to store Grand Rapids 2.5 ENE and Grand Rapids 4.6 ESE as the top 2, but it stores the same location for all of the top 3. I also tried dropping duplicates, but it didn't work. How can I find the top 3 without duplicates?

Please clarify what exactly the issue is? _I also tried dropping duplicates but it didn't work._ We need a [mcve], then. Also, please do not share information as images unless absolutely necessary. See: https://meta.stackoverflow.com/questions/303812/discourage-screenshots-of-code-and-or-errors, https://idownvotedbecau.se/imageofcode, https://idownvotedbecau.se/imageofanexception/. — AMC, Apr 11 '20 at 03:13
Does this answer your question? [Drop all duplicate rows in Python Pandas](https://stackoverflow.com/questions/23667369/drop-all-duplicate-rows-in-python-pandas) — AMC, Apr 11 '20 at 03:14
Thank you for trying to help, but my question was already solved and I did flag an answer. — cccccc, Apr 11 '20 at 03:23
That doesn’t mean the question’s life is over though, much to the contrary. — AMC, Apr 11 '20 at 03:24

score 1 · Accepted Answer · answered Apr 11 '20 at 01:42

    df2016.sort_values('Snow Mean', ascending=False).drop_duplicates(subset='Location Column', keep='first').head(3)
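To make the one-liner above easier to try out, here is a minimal sketch on made-up data. The frame contents, the station name "Holland 1.0 N", and the column name 'Location Column' are assumptions for illustration only; substitute the real name of the location column.

```python
import pandas as pd

# Hypothetical sample data: every value here is invented, and
# 'Location Column' is a placeholder for the real location column name.
df2016 = pd.DataFrame({
    'Location Column': [
        'Grand Rapids 2.5 ENE', 'Grand Rapids 2.5 ENE', 'Grand Rapids 2.5 ENE',
        'Grand Rapids 4.6 ESE', 'Grand Rapids 4.6 ESE', 'Holland 1.0 N',
    ],
    'Snow Mean': [9.0, 8.5, 8.0, 7.0, 6.5, 3.0],
})

top3 = (
    df2016.sort_values('Snow Mean', ascending=False)                # highest snow first
          .drop_duplicates(subset='Location Column', keep='first')  # keep each location's best row
          .head(3)                                                  # three distinct locations
)

# Store the result; index=False leaves the old row labels out of the file.
top3.to_csv('top3.csv', index=False)
```

Sorting first matters here: `keep='first'` then retains the highest-snow row for each location, so `head(3)` returns three different locations.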

Eric Truett


Alexander Popov · Answer 2 · 2020-07-17T05:35:30.993

I think pandas.DataFrame.drop_duplicates() should work fine. Select only the second and the last column before calling it, because the third column of the data holds unique values, which keeps whole rows from ever counting as duplicates.

    # 'column2' stands for the name of the second (location) column
    df = df2016[['column2', 'Snow Mean']].drop_duplicates().nlargest(3, 'Snow Mean')
    df.to_csv('top3.csv')

or

    # Select by position: index 1 is the second column, -1 is the last column
    df = df2016.iloc[:, [1, -1]].drop_duplicates().nlargest(3, 'Snow Mean')
    df.to_csv('top3.csv')

This works as long as each location has a single snow value. If one location has several different snow values, its rows are not exact duplicates, so that location can still appear more than once in the top 3.
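The caveat can be seen on a small made-up frame, together with one possible way around it. The sketch below is not part of the original answer: it swaps in a `groupby` on the location column and takes each location's maximum before picking the top 3. The data and the column name 'Location Column' are hypothetical.

```python
import pandas as pd

# Hypothetical data: locations 'A' and 'B' each have two different snow values.
df2016 = pd.DataFrame({
    'Location Column': ['A', 'A', 'B', 'B', 'C', 'D'],
    'Snow Mean': [9.0, 8.5, 7.0, 6.5, 3.0, 1.0],
})

# Dropping exact (location, snow) duplicates removes nothing here,
# so 'A' still shows up twice among the three largest values.
pairs = (
    df2016[['Location Column', 'Snow Mean']]
    .drop_duplicates()
    .nlargest(3, 'Snow Mean')
)

# Alternative sketch: reduce each location to its maximum, then take the top 3.
per_location = (
    df2016.groupby('Location Column')['Snow Mean']
          .max()          # one value per location
          .nlargest(3)    # three largest of those
          .reset_index()  # back to a two-column DataFrame
)
```

`per_location` can then be written out with `to_csv` in the same way as before.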
