0

I want to check whether the date I am working with is already existing in a DataFrame. If yes, I would like to add values from another DataFrame to the count column for this date's row. I no, I would like just to concat new row to the DataFrame.

My Data Frames look like this:

result_df:

       date     count
0 '2021-01-01'    10           
1 '2021-01-02'    13       
...

temp_df:

      date      count          
0 '2021-01-02'    8    

The code I am using is following:

date = '2021-01-02'
if result_df.loc[result_df['date'].isin([date])].any():
   result_df.loc[result_df['date'] == date, 'count'] += temp_df['count']
else:
   result_df = pd.concat([result_df, temp_df], ignore_index=True)

As a result I have following error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Please help me to solve this issue.

3 Answers3

0

If you are matching for one date only, like the sample data, you can try:

date = '2021-01-02'
result_df.loc[result_df['date'].isin([date]), 'count'] += temp_df.loc[temp_df['date'] == date, 'count'].values

Result:

print(result_df)

         date  count
0  2021-01-01     10
1  2021-01-02     21

The reason your codes get the error is because Python if-statement cannot handle a Pandas Series which contains multiple values. As your code with .isin() actually return a boolean Series with 2 True/False values corresponding to the checking results of the 2 rows of the 2 dates. Hence, the error. You have to replace your use of Python if-statement to only Pandas / numpy codes to properly handle it.

SeaBean
  • 22,547
  • 3
  • 13
  • 25
0

You want to check if the date is in the column if I understand you correctly. That can be done like this:

date = '2021-01-02'
if date in result_df['date'].values:
   result_df.loc[result_df['date'] == date, 'count'] += temp_df['count']
else:
   result_df = pd.concat([result_df, temp_df], ignore_index=True)
Rutger
  • 593
  • 5
  • 11
0

Specific to what you're trying, you can get the matching dates using this:

result_df['date'].isin(temp_df['date'])
0    False
1     True
Name: date, dtype: bool

But since you want add count from the two df's when date matches, use df.merge() to create a dataframe with the necessary matches, and then add the count from the df on the 'right':

merged = result_df.merge(temp_df, how='left', on='date', suffixes=('', '_y')).fillna(0)
result_df['count'] += merged['count_y']

Note that this will match on each date which is common in both df's, not just one date value. So you can do this for all matching dates temp_df in one go.

fillna is used to put in a 0 for the counts which are missing in temp_df. In the merged dataframe, it has _y suffix:

           date  count  count_y
0  '2021-01-01'     10      0.0
1  '2021-01-02'     13      8.0

Final result_df is:

           date  count
0  '2021-01-01'   10.0
1  '2021-01-02'   21.0

For the error you're getting, see this question - Truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

aneroid
  • 12,983
  • 3
  • 36
  • 66
  • Thank you a lot for giving me a fresh view. – Katsiaryna Shkirych Jun 17 '21 at 20:59
  • @KatsiarynaShkirych You're welcome. And welcome to StackOverflow! If that helped, read: [What should I do when someone answers my question?](https://stackoverflow.com/help/someone-answers), and about [accepting](https://meta.stackexchange.com/a/5235/193893) and [voting](https://stackoverflow.com/privileges/vote-up). – aneroid Jun 18 '21 at 11:57