How to show columns that have different values in rows?

Question

I have pandas DataFrame with 2 rows:

+---------+---------+-----------+------------+
|       ID|     Type|      Index|        Code|
+---------+---------+-----------+------------+
|111111111|       aa|          1|         XXX|
|111111111|       aa|       null|         XXX|

How can I show those columns that have different values between rows of df? (there might be more than 2 rows).

In the above-shown example, the expected output is Index, because the first row has Index equal to 1 and the second row has Index equal to null.

Are you looking for `nunique`? `df.columns[df.nunique(axis=0, dropna=False).gt(1)] ` — cs95, Aug 05 '19 at 19:58
Are you trying to compare values across every row? Or consecutive rows only? Are you claiming that every row should have the same column values (with exceptions)? — cs95, Aug 05 '19 at 20:01
So, can you confirm that `df.columns[df.nunique(axis=0, dropna=False).gt(1)]` is what you want? Or maybe you want to do this per ID as rafaelc mentioned. — cs95, Aug 05 '19 at 20:09

score 2 · Accepted Answer · answered Aug 05 '19 at 20:18

You can simply select the column/s (or the dataframe) with columns having unique values more than 1. You can get those columns whose values are distinct with

def cols_having_unique(df):
    my_cols = []
    for col in df.columns:
        if df[col].nunique(dropna=False) > 1:
            my_cols.append(col)
    return df[my_cols].copy()

new_df = cols_having_unique(df)

score 1 · Answer 2 · answered Aug 05 '19 at 20:15

1

You can use this code:

for col in df.columns:
   if len(df[col].value_counts()) > 1:
      print('The column ' +  col + ' has different values')

answered Aug 05 '19 at 20:15

Noha Elprince

1,924
1
16
10

Bitto · Answer 3 · 2019-08-05T20:23:02.250

Loop through pandas.DataFrame.columns and check if two rows have the same value for every column.

import pandas as pd
data=[
    [111111111,'aa',1,'XXX'],
    [111111111,'aa','null','XXX'],
    [111111111,'aa','null','YYY']
]
df=pd.DataFrame(data, columns=["ID", "Type", "Index", "Code"])

def get_different_columns(r1, r2):
    different_col_list=[]
    for column in df.columns:
        if df.iloc[r1][column] != df.iloc[r2][column]:
            different_col_list.append(column)
    return different_col_list

print(get_different_columns(0,1))
#['Index']

print(get_different_columns(1,2))
#['Code']

print(get_different_columns(0,2))
#['Index', 'Code']

score -3 · Answer 4 · answered Aug 05 '19 at 19:58

-3

I suppose you can count distinct values in a single column by iterating through all the columns.

How to count distinct values: Count unique values with pandas per groups

answered Aug 05 '19 at 19:58

Peiqin

378
4
12

2

If this is a duplicate, please vote to close as duplicate or leave a comment. **Do not** make an answer that just links to the duplicate. – user3483203 Aug 05 '19 at 20:00

How to show columns that have different values in rows?

4 Answers4