
This is my code right now

    df = df2016.nlargest(3, ['Snow Mean'])
    df.to_csv('top3.csv')

However, this code gives me duplicated locations. Because all of my top snow values come from the same location, it returns that one location three times.

[Image: screenshot of the data]

Here is my data. In the image, the second column is the location and the last column is snow. I want my code to store Grand Rapids 2.5 ENE and Grand Rapids 4.6 ESE as the top 2, but it stores the same location for all of the top 3. I also tried dropping duplicates, but it didn't work. How can I find the top 3 without duplicate locations?

AMC
cccccc
  • Please clarify what exactly the issue is? _I also tried dropping duplicates but it didn't work._ We need a [mcve], then. Also, please do not share information as images unless absolutely necessary. See: https://meta.stackoverflow.com/questions/303812/discourage-screenshots-of-code-and-or-errors, https://idownvotedbecau.se/imageofcode, https://idownvotedbecau.se/imageofanexception/. – AMC Apr 11 '20 at 03:13
  • Does this answer your question? [Drop all duplicate rows in Python Pandas](https://stackoverflow.com/questions/23667369/drop-all-duplicate-rows-in-python-pandas) – AMC Apr 11 '20 at 03:14
  • Thank you for trying to help, but my question was already solved and I did flag an answer. – cccccc Apr 11 '20 at 03:23
  • That doesn’t mean the question’s life is over though, much to the contrary. – AMC Apr 11 '20 at 03:24

2 Answers

1
    df2016.sort_values('Snow Mean', ascending=False).drop_duplicates(subset='Location Column', keep='first').head(3)
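As a minimal sketch of how this plays out, the snippet below runs the same chain on a small made-up frame. The column name `Location` and all of the numbers are assumptions for illustration only; just `Snow Mean` and the two Grand Rapids station names come from the question.

```python
import pandas as pd

# Made-up sample data; 'Location' is a hypothetical column name.
df2016 = pd.DataFrame({
    'Location': ['Grand Rapids 2.5 ENE', 'Grand Rapids 2.5 ENE',
                 'Grand Rapids 4.6 ESE', 'Grand Rapids 2.5 ENE',
                 'Station C', 'Station D'],
    'Snow Mean': [9.0, 8.5, 7.0, 8.0, 3.0, 1.0],
})

# nlargest on its own picks the three largest rows, which here
# all belong to the same location.
naive = df2016.nlargest(3, 'Snow Mean')

# Sort by snow, keep only the first (largest) row per location,
# then take the first three rows.
top3 = (
    df2016.sort_values('Snow Mean', ascending=False)
          .drop_duplicates(subset='Location', keep='first')
          .head(3)
)
print(top3)
```

Because the frame is sorted before `drop_duplicates`, `keep='first'` retains each location's largest snow value.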
Eric Truett
0

I think pandas.DataFrame.drop_duplicates() should work fine. Just select only the second and the last columns before calling it, because the third column holds unique values, so the full rows are never exact duplicates.

    # 'column2' is a placeholder for the name of the location column
    df = df2016[['column2', 'Snow Mean']].drop_duplicates().nlargest(3, 'Snow Mean')
    df.to_csv('top3.csv')

or

    # select the location and snow columns by position
    df = df2016.iloc[:, [1, 5]].drop_duplicates().nlargest(3, 'Snow Mean')
    df.to_csv('top3.csv')

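This should work as you want as long as each location has only one snow value. If the same location appears with different snow values, those rows are not duplicates of each other, so the location can still show up more than once.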
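To illustrate that caveat, here is a small sketch on made-up data (the `Location` column name and every value are assumptions, not taken from the question's data). It also shows one possible way around it, taking the maximum per location with `groupby`, which is a different technique from the one in this answer.

```python
import pandas as pd

# Made-up (location, snow) pairs; 'Location' is a hypothetical column name.
pairs = pd.DataFrame({
    'Location': ['Grand Rapids 2.5 ENE', 'Grand Rapids 2.5 ENE',
                 'Grand Rapids 2.5 ENE', 'Grand Rapids 4.6 ESE',
                 'Station C'],
    'Snow Mean': [9.0, 9.0, 8.5, 7.0, 3.0],
})

# Only the exact repeat (ENE, 9.0) is dropped; (ENE, 8.5) survives,
# so the same location still appears twice in the top 3.
deduped = pairs.drop_duplicates().nlargest(3, 'Snow Mean')

# Alternative: reduce to one value per location first, then rank.
per_location = pairs.groupby('Location')['Snow Mean'].max().nlargest(3)
print(deduped)
print(per_location)
```

The `groupby` version returns a Series indexed by location, so it drops the other columns; call `.reset_index()` before `to_csv` if a flat table is wanted.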