
Hope you are all doing well.

I want to find the most common elements in a DataFrame, meaning the elements that appear in every row.

I have this function, but it only works when the DataFrame has two rows.

def find_common_elements(df):
    df = df.drop(['motif', 'frequency', 'motif_cleaned'], axis=1)
    df = df.dropna(axis=1, how='all')
    # Get the set of elements in the first row
    common_elements = set(df.iloc[0])
    # Iterate over the remaining rows and update the common elements
    for i in range(1, len(df)):
        common_elements.intersection_update(set(df.iloc[i]))
    return common_elements

Can anyone please help?

For example, if I have this sample data:

data = {
    'A': [1, 1, 1, 1, 1],
    'B': [2, 1, 2, 2, 1],
    'C': [1, 2, 0, 0, 2]
}

the most common elements here are 1 and 2, which appear in all rows.

Thank you!

1 Answer


Try using value_counts with nlargest

import numpy as np
import pandas as pd

# sample data
np.random.seed(1)
df = pd.DataFrame(np.random.randint(1, 10, (20,5)), columns=list('abcde'))

# stack every column into one
# get the counts of each value
# return the 5 largest (you can change this to any number you want to return)
df.stack().value_counts().nlargest(5)

8    20 # number 8 occurs 20 times
9    12 # number 9 occurs 12 times
1    12 # number 1 occurs 12 times
5    12 # number 5 occurs 12 times
2    11 # number 2 occurs 11 times

UPDATE

Try using functools.reduce with numpy.intersect1d

import pandas as pd
import numpy as np
from functools import reduce

# your sample dataframe
data = { 'A': [1, 1, 1, 1, 1], 'B': [2, 1, 2, 2, 1], 'C': [1, 2, 0, 0, 2] }
df = pd.DataFrame(data)

# use reduce with np.intersect1d
reduce(np.intersect1d, df.values) # -> array([1, 2])
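
If your real DataFrame still contains the extra columns from your original function, you can keep the same preprocessing and just replace the row loop with the reduce call. A minimal sketch, assuming the column names ('motif', 'frequency', 'motif_cleaned') from your code:

import numpy as np
import pandas as pd
from functools import reduce

def find_common_elements(df):
    # drop the metadata columns from your original function (skipped if they are absent)
    df = df.drop(columns=['motif', 'frequency', 'motif_cleaned'], errors='ignore')
    # drop columns that are entirely NaN, as in your original function
    df = df.dropna(axis=1, how='all')
    # intersect the values of every row; only elements present in all rows remain
    return reduce(np.intersect1d, df.values)

find_common_elements(pd.DataFrame(data)) # -> array([1, 2])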
It_is_Chris
  • I think you misunderstood me. I want to display only the elements that appear in all rows. For example, with this sample data: data = { 'A': [1, 1, 1, 1, 1], 'B': [2, 1, 2, 2, 1], 'C': [1, 2, 0, 0, 2] }, the most common elements here are 1 and 2. – Myriam_2189 Jun 06 '23 at 19:18
  • @Myriam_2189 sorry, you are correct, I did misunderstand what you were looking for. Please see the update to my answer. – It_is_Chris Jun 07 '23 at 14:26