Remove rows with values repeated on specific columns

Asked Jan 06 '18 at 14:32

Active Jan 06 '18 at 14:32

Viewed 21 times

I have a DataFrame of user reviews like the following:

Index   User   Location  Rating Language
    1    bob     62.354    4       eng
    2    bil     59.511    5       span
    3    bob     63.884    3       ger
    4    juan    58.221    4       jap
    5    bil     59.511    5       eng
    6    bil     57.422    5       fra

I'm trying to remove duplicate reviews, that is, rows that have the same values in both the 'User' and 'Location' columns.

My desired output would be something like this:

Index   User   Location  Rating Language
    1    bob     62.354    4       eng
    2    bil     59.511    5       span
    3    bob     63.884    3       ger 
    4    juan    58.221    4       jap 
    6    bil     57.422    5       fra

Here the 5th row was deleted because it duplicates the 2nd row: both have the same values in the 'User' and 'Location' columns. Keep in mind that usernames are unique to users and each location value is unique to a place. The other variables are just categorical.

Thank you. This has been driving me crazy.

Juan R.

You need `df = df.drop_duplicates(['User','Location'])`, it is a dupe :( – jezrael Jan 06 '18 at 14:36
Thank you. I did not know how to word it correctly and therefore could not find it. – Juan R. Jan 06 '18 at 14:37
Yes, I understand you. The hardest part is knowing which keywords to search for... – jezrael Jan 06 '18 at 14:38
Anyway, worked like a charm. So thank you very much! – Juan R. Jan 06 '18 at 14:39
You are welcome, you can upvote my solution in the dupe, thanks. – jezrael Jan 06 '18 at 14:40
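
For reference, here is a minimal sketch of the `drop_duplicates` call from the comments, applied to the sample data in the question. How the frame is built is an assumption: the 'Index' column is treated as the DataFrame index and 'Location' as a float column, since the question does not show how the data is loaded.

```python
import pandas as pd

# Sample data rebuilt from the question. Assumptions: 'Index' is the
# DataFrame index and 'Location' is stored as floats.
df = pd.DataFrame(
    {
        "User": ["bob", "bil", "bob", "juan", "bil", "bil"],
        "Location": [62.354, 59.511, 63.884, 58.221, 59.511, 57.422],
        "Rating": [4, 5, 3, 4, 5, 5],
        "Language": ["eng", "span", "ger", "jap", "eng", "fra"],
    },
    index=pd.Index([1, 2, 3, 4, 5, 6], name="Index"),
)

# Rows count as duplicates only when BOTH 'User' and 'Location' match.
# keep="first" (the default) keeps the earliest row of each duplicate group.
deduped = df.drop_duplicates(subset=["User", "Location"], keep="first")

print(deduped)
```

With the default `keep="first"`, row 5 is dropped and row 2 is kept, which matches the desired output. Passing `keep="last"` would keep row 5 instead, and `keep=False` would drop both rows.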
