0

I want the code to check if the dataframe has all elements of period_list. If not I want to add that element to the dataframe, the values associated with the element will be zero. I wrote this and it didn't work Dataframe: test_1

    Period       A     B    C
0   2018 - Q2    1     0    1
1   2018 - Q3    1     1    1
2   2018 - Q4    0     1    1
3   2019 - Q1    0     0    0
4   2019 - Q2    0     0    1
5   2019 - Q3    1     0    1
6   2019 - Q4    0     1    1
7   2020 - Q1    1     0    1
8   2020 - Q2    0     0    0

my code

period_list = ['2018 - Q1', '2018 - Q2', '2018 - Q3', '2018 - Q4', '2019 - Q1', '2019 - Q2', '2019 - Q3', '2019 - Q4', '2020 - Q1', '2020 - Q2', '2020 - Q3', '2020 - Q4', '2021 - Q1']

for row in period_list:
    if row not in test_1.iloc[:, 0]:
        test_1.append(row)
anky
  • 74,114
  • 11
  • 41
  • 70
RM swiftie
  • 13
  • 4
  • Related: [Merge two dataframes by index](https://stackoverflow.com/questions/40468069/merge-two-dataframes-by-index) – smci Jun 19 '21 at 20:49

2 Answers2

1

Try with set_index + reindex + reset_index:

df = df.set_index('Period').reindex(period_list, fill_value=0).reset_index()

df:

       Period  A  B  C
0   2018 - Q1  0  0  0
1   2018 - Q2  1  0  1
2   2018 - Q3  1  1  1
3   2018 - Q4  0  1  1
4   2019 - Q1  0  0  0
5   2019 - Q2  0  0  1
6   2019 - Q3  1  0  1
7   2019 - Q4  0  1  1
8   2020 - Q1  1  0  1
9   2020 - Q2  0  0  0
10  2020 - Q3  0  0  0
11  2020 - Q4  0  0  0
12  2021 - Q1  0  0  0

DataFrame and list used:

import pandas as pd

df = pd.DataFrame({'Period': ['2018 - Q2', '2018 - Q3', '2018 - Q4',
                              '2019 - Q1', '2019 - Q2', '2019 - Q3',
                              '2019 - Q4', '2020 - Q1', '2020 - Q2'],
                   'A': [1, 1, 0, 0, 0, 1, 0, 1, 0],
                   'B': [0, 1, 1, 0, 0, 0, 1, 0, 0],
                   'C': [1, 1, 1, 0, 1, 1, 1, 1, 0]})

period_list = ['2018 - Q1', '2018 - Q2', '2018 - Q3', '2018 - Q4', '2019 - Q1',
               '2019 - Q2', '2019 - Q3', '2019 - Q4', '2020 - Q1', '2020 - Q2',
               '2020 - Q3', '2020 - Q4', '2021 - Q1']
Henry Ecker
  • 34,399
  • 18
  • 41
  • 57
1

A right join on indices with df.merge() does this automatically, per Merge two dataframes by index:

df.merge(pd.DataFrame(index=period_list), left_index=True, right_index=True, how='right')
                  A       B      C
2018 - Q1       NaN     NaN    NaN
2018 - Q2       1.0     0.0    1.0
2018 - Q3       1.0     1.0    1.0
2018 - Q4       0.0     1.0    1.0
2019 - Q1       0.0     0.0    0.0
2019 - Q2       0.0     0.0    1.0
2019 - Q3       1.0     0.0    1.0
2019 - Q4       0.0     1.0    1.0
2020 - Q1       1.0     0.0    1.0
2020 - Q2       0.0     0.0    0.0
2020 - Q3       NaN     NaN    NaN
2020 - Q4       NaN     NaN    NaN
2021 - Q1       NaN     NaN    NaN

Then as @anky wrote you want to .fillna(0,downcast='infer') to convert NaNs to 0s, and downcast from float to integer:

           A  B  C
2018 - Q1  0  0  0
2018 - Q2  1  0  1
2018 - Q3  1  1  1
2018 - Q4  0  1  1
2019 - Q1  0  0  0
2019 - Q2  0  0  1
2019 - Q3  1  0  1
2019 - Q4  0  1  1
2020 - Q1  1  0  1
2020 - Q2  0  0  0
2020 - Q3  0  0  0
2020 - Q4  0  0  0
2021 - Q1  0  0  0
smci
  • 32,567
  • 20
  • 113
  • 146