I need a new column C where each value is the frequency with which the values in two other columns, A and B, appear together in the data.
   A  B  C
0  7  9  2
1  7  2  2
2  1  9  3
3  4  8  1
4  9  1  1
5  6  4  1
6  7  2  2
7  7  9  2
8  1  9  3
9  1  9  3
I tried making a dictionary out of a value count like this:
import pandas as pd
import numpy as np
df = pd.DataFrame({
    'A': np.random.randint(1, 10, 100),
    'B': np.random.randint(1, 10, 100)
})
mapper = df.value_counts().to_dict()
Then I convert each row to a tuple and feed it back through the dictionary in pandas' apply function:
df['C'] = df.apply(lambda x: mapper[tuple(x)], axis=1)
I'm not sure whether this solution is (a) correct or (b) efficient, since apply with axis=1 calls the lambda once per row. Is there a better way of going about it?
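One possible vectorized alternative, given only as a minimal sketch and not a confirmed answer, is to group on both columns and broadcast each group's size back to its rows with groupby(...).transform('size'). The sketch assumes the ten example rows from the table above instead of the random data, and keeps the dictionary approach alongside it so the two results can be compared.

```python
import pandas as pd

# Example data copied from the table above (columns A and B only).
df = pd.DataFrame({
    'A': [7, 7, 1, 4, 9, 6, 7, 7, 1, 1],
    'B': [9, 2, 9, 8, 1, 4, 2, 9, 9, 9],
})

# Count the rows in each (A, B) group and broadcast that count to every row of the group.
df['C'] = df.groupby(['A', 'B'])['A'].transform('size')

# Dictionary-based approach from the question, restricted to A and B for comparison.
mapper = df[['A', 'B']].value_counts().to_dict()
c_apply = df[['A', 'B']].apply(lambda x: mapper[tuple(x)], axis=1)

# True when both approaches agree on every row.
same = bool((df['C'] == c_apply).all())
print(df)
print(same)
```

On this sample, both approaches give the C column shown in the table, which suggests the dictionary version is correct. The groupby version avoids the Python-level call per row.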