0

I've tried several solutions from similar problems, but so far, no luck. I know it's probably simple.

I have two pandas dataframes. One contains temperatures and months, df1. The other contains months and a possible range of temperatures, df2. I would like to count how many times a temperature for a particular month occurs based on df2.

df1:

Month  Temp  
 1      10
 1      10
 1      20
 2      5
 2      10
 2      15

df2:

Month  Temp  
 1      0
 1      5
 1      10
 1      15
 1      20
 1      25
 2      0
 2      5
 2      10
 2      15
 2      20
 2      25

desired output with a new columns, Count, in df2:

Month  Temp  Count
 1      0      0
 1      5      0
 1      10     2
 1      15     0
 1      20     1
 1      25     0
 2      0      0
 2      5      1
 2      10     1
 2      15     1
 2      20     0
 2      25     0
import pandas as pd

df1 = pd.DataFrame({'Month': [1]*3 + [2]*3,
                   'Temp': [10,10,20,5,10,15]})

df2 = pd.DataFrame({'Month': [1]*6 + [2]*6,
                   'Temp': [0,5,10,15,20,25]*2})

df2['Count'] = 
J. McConnor
  • 101
  • 1
  • 10
  • Does this answer your question? [Pandas Merging 101](https://stackoverflow.com/questions/53645882/pandas-merging-101) – Amit Gupta Jul 16 '21 at 18:43

4 Answers4

1

Try this:

(df2.join(
    df.groupby(['Month','Temp']).size().rename('count'),
    on=['Month','Temp'])
 .fillna(0))
rhug123
  • 7,893
  • 1
  • 9
  • 24
1

An approach using value_counts and reindex:

new_index = pd.MultiIndex.from_frame(df2)
new_df = (
    df1.value_counts(["Month", "Temp"])
    .reindex(new_index, fill_value=0)
    .rename("Count")
    .reset_index()
)

    Month  Temp  Count
0       1     0      0
1       1     5      0
2       1    10      2
3       1    15      0
4       1    20      1
5       1    25      0
6       2     0      0
7       2     5      1
8       2    10      1
9       2    15      1
10      2    20      0
11      2    25      0
Cameron Riddell
  • 10,942
  • 9
  • 19
1

Another solution:

x = (
    df1.assign(Count=1)
    .merge(df2, on=["Month", "Temp"], how="outer")
    .fillna(0)
    .groupby(["Month", "Temp"], as_index=False)
    .sum()
    .astype(int)
)
print(x)

Prints:

    Month  Temp  Count
0       1     0      0
1       1     5      0
2       1    10      2
3       1    15      0
4       1    20      1
5       1    25      0
6       2     0      0
7       2     5      1
8       2    10      1
9       2    15      1
10      2    20      0
11      2    25      0
Andrej Kesely
  • 168,389
  • 15
  • 48
  • 91
1

try:

res = (df2.set_index(['Month', 'Temp'])
       .join(df1.value_counts().to_frame(name='count'))
       .reset_index().fillna(0).astype(int))

OR

di = df1.value_counts().to_dict()
df2['count'] = df2.apply(lambda x: 0 if tuple(x) not in di.keys()  else di[tuple(x)], axis=1)

    Month   Temp    count
0   1       0       0
1   1       5       0
2   1       10      2
3   1       15      0
4   1       20      1
5   1       25      0
6   2       0       0
7   2       5       1
8   2       10      1
9   2       15      1
10  2       20      0
11  2       25      0
Pygirl
  • 12,969
  • 5
  • 30
  • 43