I have the following dataframe:

      reference | topcredit | currentbalance | creditlimit
  1       1     |    50     |       20       |     70
  2       1     |    30     |       28       |     50
  3       1     |    50     |       20       |     70
  4       1     |    81     |       32       |     100
  5       2     |    70     |       0        |     56
  6       2     |    50     |       20       |     70
  7       2     |    100    |       0        |     150
  8       3     |    85     |       85       |     95
  9       3     |    85     |       85       |     95

And so on...
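For reproducibility, the sample rows above could be built roughly like this. This is only a sketch: the variable name `df` and the 1-based index are assumptions made to mirror the row labels in the table.

```python
import pandas as pd

# Sample rows copied from the table above.
# The index 1..9 is an assumption that mirrors the row labels shown there.
df = pd.DataFrame(
    {
        "reference": [1, 1, 1, 1, 2, 2, 2, 3, 3],
        "topcredit": [50, 30, 50, 81, 70, 50, 100, 85, 85],
        "currentbalance": [20, 28, 20, 32, 0, 20, 0, 85, 85],
        "creditlimit": [70, 50, 70, 100, 56, 70, 150, 95, 95],
    },
    index=range(1, 10),
)

print(df)
```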

I want to drop duplicate rows within each 'reference', but only those rows that have the same topcredit, currentbalance and creditlimit.

In reference 1, lines 1 and 3 have the same values in all three columns, and line 6 in reference 2 has those same values as well. I would like to keep one of the reference 1 rows and also keep line 6 of reference 2. In reference 3, both lines have the same information too.

The expected output is:

 reference | topcredit | currentbalance | creditlimit
    1      |    50     |       20       |      70
    1      |    30     |       28       |      50
    1      |    81     |       32       |      100
    2      |    70     |       24       |      56
    2      |    50     |       20       |      70
    2      |   100     |       80       |      150
    3      |    85     |       85       |      95

I would appreciate the help; I've been searching for how to do it for a while.

esg
  • `df.drop_duplicates()`, or am I missing something? – ALollz Mar 08 '19 at 18:16
  • How do you do it based on 3 conditions: topcredit, currentbalance and creditlimit being the same for two rows in each reference? If I drop duplicates on current balance or any other column reference 2 would've been dropped too. – esg Mar 08 '19 at 18:18
  • You drop duplicates on **all** columns (i.e. just specify nothing), so that a row gets removed only if the same reference has the same topcredit and currentbalance and creditlimit. – ALollz Mar 08 '19 at 18:23
  • Wow, thanks! I was thinking of such complicated solutions that I didn't think of such a simple one. Thanks! @ALollz – esg Mar 08 '19 at 18:36
  • Possible duplicate of [Drop all duplicate rows in Python Pandas](https://stackoverflow.com/questions/23667369/drop-all-duplicate-rows-in-python-pandas) – ALollz Mar 08 '19 at 18:38
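
The suggestion in the comments can be sketched as follows. The frame is rebuilt from the sample rows in the question (the name `df` and the 1-based index are assumptions). Calling `drop_duplicates()` with no `subset` compares all columns, so a row is dropped only when reference, topcredit, currentbalance and creditlimit all match an earlier row.

```python
import pandas as pd

# Sample rows from the question; the 1-based index mirrors its row labels (assumption).
df = pd.DataFrame(
    {
        "reference": [1, 1, 1, 1, 2, 2, 2, 3, 3],
        "topcredit": [50, 30, 50, 81, 70, 50, 100, 85, 85],
        "currentbalance": [20, 28, 20, 32, 0, 20, 0, 85, 85],
        "creditlimit": [70, 50, 70, 100, 56, 70, 150, 95, 95],
    },
    index=range(1, 10),
)

# No subset given: every column takes part in the comparison.
# The default keep="first" retains the first row of each duplicate group.
deduped = df.drop_duplicates()

# Equivalent here, with the compared columns spelled out explicitly.
explicit = df.drop_duplicates(
    subset=["reference", "topcredit", "currentbalance", "creditlimit"]
)

print(deduped)
```

With this sample, rows 3 and 9 are removed, while row 6 stays because its reference differs from row 1. `drop_duplicates` only removes rows and does not change values, so rows 5 and 7 keep a currentbalance of 0 rather than the 24 and 80 shown in the expected output.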

0 Answers