I am trying to merge two pandas DataFrames as below:
import numpy as np
import pandas as pd
df = pd.DataFrame({"vals": np.random.RandomState(31).randint(-30, 30, size=15),
                   "grps": np.random.RandomState(31).choice(["A", "B"], 15)})
mean = df.groupby('grps').mean().rename(columns={'vals':'mean'})
df.merge(df, mean, left_on='grps', right_index=True)
But I got this error:
...
/opt/conda/lib/python3.7/site-packages/pandas/core/reshape/merge.py in _maybe_coerce_merge_keys(self)
1144 inferred_right in string_types and inferred_left not in string_types
1145 ):
-> 1146 raise ValueError(msg)
1147
1148 # datetimelikes must match exactly
ValueError: You are trying to merge on object and int64 columns. If you wish to proceed you should use pd.concat
I think the merge keys in these two DataFrames have the same type. Why does this error occur?
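To back up the claim about the key types, here is a minimal sketch that prints the dtypes involved. It rebuilds the same `df` and `mean` as above. Printing the dtype of `df`'s own default index is an extra check added here for comparison, not something the original code does.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"vals": np.random.RandomState(31).randint(-30, 30, size=15),
                   "grps": np.random.RandomState(31).choice(["A", "B"], 15)})
mean = df.groupby("grps").mean().rename(columns={"vals": "mean"})

# dtype of the key column on the left side
left_key_dtype = df["grps"].dtype
# dtype of the key on the right side (the group labels live in mean's index)
right_key_dtype = mean.index.dtype
# dtype of df's own default integer index, printed only for comparison
df_index_dtype = df.index.dtype

print(left_key_dtype, right_key_dtype, df_index_dtype)
```

If the first two dtypes match, the mismatch reported in the error has to come from some other pair of keys than `df['grps']` and `mean.index`.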
2020-05-02 update:
@wwnde pointed out that the error may be related to the two DataFrames having different index sizes. But then how do you explain the following working example?
df1 = pd.DataFrame({'employee': ['Bob', 'Jake', 'Lisa', 'Sue'],
                    'group': ['Accounting', 'Engineering', 'Engineering', 'HR']})
df2 = pd.DataFrame({'group': ['Accounting', 'Accounting',
                              'Engineering', 'Engineering', 'HR', 'HR'],
                    'skills': ['math', 'spreadsheets', 'coding', 'linux',
                               'spreadsheets', 'organization']})
print(df1)
print(df2)
pd.merge(df1, df2)
I then called reset_index on the mean DataFrame, and now I get a different error:
mean = df.groupby('grps').mean().rename(columns={'vals':'mean'}).reset_index()
df.merge(df, mean, on='grps')
Error:
/opt/conda/lib/python3.7/site-packages/pandas/core/generic.py in __nonzero__(self)
1477 def __nonzero__(self):
1478 raise ValueError(
-> 1479 f"The truth value of a {type(self).__name__} is ambiguous. "
1480 "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
1481 )
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
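One thing worth checking in both failing calls: `DataFrame.merge` is a method whose signature starts `merge(right, how='inner', ...)`, so `df.merge(df, mean, ...)` passes `df` as `right` and `mean` as `how`. That would be consistent with both errors: the first call would compare `grps` against `df`'s own integer index, and the second would evaluate a DataFrame in a truth test while validating `how`. The sketch below, assuming the same `df` and `mean` as above, shows the call forms that pass only the intended frames. The names `merged_method`, `merged_func`, and `merged_on` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"vals": np.random.RandomState(31).randint(-30, 30, size=15),
                   "grps": np.random.RandomState(31).choice(["A", "B"], 15)})
mean = df.groupby("grps").mean().rename(columns={"vals": "mean"})

# Method form: df is already the left frame, so only the right frame is passed.
merged_method = df.merge(mean, left_on="grps", right_index=True)

# Function form: both frames are passed explicitly to pd.merge.
merged_func = pd.merge(df, mean, left_on="grps", right_index=True)

# After reset_index, "grps" is an ordinary column on both sides.
merged_on = df.merge(mean.reset_index(), on="grps")

print(merged_method.head())
```

This also matches the working `pd.merge(df1, df2)` example, which uses the function form with exactly two frames.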