I am trying to create a function that uses df.iterrows() and Series.nlargest. I want to iterate over each row and find the largest number and then mark it as a 1.
This is the data frame:
A B C
9 6 5
3 7 2
Here is the output I wish to have:
A B C
1 0 0
0 1 0
This is the function I wish to use here:
def get_top_n(df, top_n):
    """
    Parameters
    ----------
    df : DataFrame
    top_n : int
        The top number to get

    Returns
    -------
    top_numbers : DataFrame
        Returns the top number marked with a 1
    """
    # Implement Function
    for row in df.iterrows():
        top_numbers = row.nlargest(top_n).sum()

    return top_numbers
I get the following error: AttributeError: 'tuple' object has no attribute 'nlargest'
I would appreciate help on how to rewrite my function so that it is neater and actually works. Thanks in advance!
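
For reference, the error most likely arises because df.iterrows() yields (index, Series) tuples rather than bare rows, so the loop variable has to be unpacked before nlargest can be called on it. Below is a minimal sketch of one possible fix, assuming the row index is unique and all columns are numeric. The second function, get_top_n_vectorized, is a hypothetical name for a loop-free alternative built on DataFrame.rank. It is not part of the original code.

```python
import pandas as pd


def get_top_n(df, top_n):
    """Mark the top_n largest values in each row with 1 and the rest with 0."""
    # Start from an all-zero frame with the same index and columns as df.
    top_numbers = pd.DataFrame(0, index=df.index, columns=df.columns)
    # iterrows() yields (index, Series) tuples, so unpack both parts.
    for idx, row in df.iterrows():
        # The index of row.nlargest(top_n) holds the winning column labels.
        top_numbers.loc[idx, row.nlargest(top_n).index] = 1
    return top_numbers


def get_top_n_vectorized(df, top_n):
    """Loop-free variant (hypothetical helper): rank each row in descending order."""
    # method="first" breaks ties by order of appearance within the row.
    ranks = df.rank(axis=1, ascending=False, method="first")
    return ranks.le(top_n).astype(int)


df = pd.DataFrame({"A": [9, 3], "B": [6, 7], "C": [5, 2]})
result = get_top_n(df, 1)
result_vectorized = get_top_n_vectorized(df, 1)
print(result)
```

With the sample data frame above and top_n=1, both functions should mark A in the first row and B in the second. The ranking variant avoids the Python-level loop, which tends to matter on larger frames.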