
I have the following data:

data_SP = pd.read_csv(path_ServicePath, sep='\t', header=None)
data_SP.columns = ["IdSP"]
print(data_SP)

    IdSP
0   SH11
1   SH11
2   SH12
3   SH12
4   SH13
5   SH13
6   SH13
7   SH13
... ...

and also,

data_SH = pd.read_csv(path_ServiceHierarchy,sep='\t',header=None)
data_SH.columns = ["IdSH", "ConnectionClient"]
print(data_SH)
IdSH    ConnectionClient
0   SH11    CN01207507
1   SH12    CN01207509
2   SH13    CN01207539
3   SH14    CN01207542
4   SH15    CN01207564
... ...

I would like to calculate the number of occurrences of the data_SH elements in data_SP. In other words, how many times each element of data_SH['IdSH'] is repeated in data_SP. For instance:

SH11 = 2
SH12 = 2
SH13 = 4

3 Answers


Let's use Series.where + Series.isin to filter the values in IdSP which are also present in IdSH, then use value_counts to count the number of occurrences:

data_SP['IdSP'].where(lambda x: x.isin(data_SH['IdSH'])).value_counts()

Alternatively, you can also do value_counts + reindex:

data_SP['IdSP'].value_counts().reindex(data_SH['IdSH']).dropna()

IdSH
SH11    2.0
SH12    2.0
SH13    4.0
Name: IdSP, dtype: float64
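
If you also want the IdSH values that never appear in data_SP to come back with a count of 0 (and keep an integer dtype instead of the float produced by NaN), reindex accepts a fill_value argument; a minimal sketch, assuming the same two DataFrames as above:

data_SP['IdSP'].value_counts().reindex(data_SH['IdSH'], fill_value=0)  # missing IDs become 0 instead of NaN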

The best way to count the values in a column is the following:

data_SH.value_counts()
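
For the counts asked about in the question, the same idea would be applied to the IdSP column of data_SP rather than to data_SH; a minimal sketch, assuming the DataFrames shown in the question:

data_SP['IdSP'].value_counts()  # on the sample rows shown: SH13 -> 4, SH11 -> 2, SH12 -> 2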

Refer also to this question on SO


This gives the count, but it doesn't give the count of elements that are present in data_SH and not present in data_SP.

    data_SP = pd.read_csv(path_ServicePath, sep='\t', header=None)
    data_SP.columns = ["IdSP"]
    data_SP["count"] = [1] * len(data_SP)       # helper column of ones
    data_SP = data_SP.groupby("IdSP").count()   # number of rows per IdSP
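
The helper "count" column can be avoided by using groupby(...).size(), which counts the rows of each group directly; a minimal sketch of the same idea, assuming the DataFrame from the question:

    data_SP.groupby("IdSP").size()  # rows per IdSP, as an integer Series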