The goal is to find the columns that do not exist in df and create them with null values.
I have a list of column names like below:
column_list = ('column_1', 'column_2', 'column_3')
When I check whether each column exists, I only get True for the columns that exist and never get False for the ones that are missing.
for column in column_list:
    print(df.columns.isin(column_list).any())
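As far as I can tell, the loop variable is never used in that check: df.columns.isin(column_list) compares every column of df against the whole list, and .any() is True as soon as one of them matches, so the same True is printed on every iteration. A minimal sketch of the per-column result I am expecting, assuming a small hypothetical df in which only column_1 exists (the frame and the `other` column are made up for illustration):

```python
import pandas as pd

# Hypothetical frame for illustration: only column_1 is present.
df = pd.DataFrame({'column_1': [1, 2], 'other': [3, 4]})
column_list = ('column_1', 'column_2', 'column_3')

# Test each name on its own instead of the whole list at once.
exists = {column: column in df.columns for column in column_list}

# Collect the names that are absent from df.
missing = [column for column in column_list if column not in df.columns]

print(exists)
print(missing)
```

Here exists maps each name to True or False, and missing holds only the names that would need to be created.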
In PySpark, I can achieve this using the below:
from pyspark.sql.functions import lit

for column in column_list:
    if column not in df.columns:
        df = df.withColumn(column, lit(''))
How can I achieve the same using Pandas?
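For reference, a rough sketch of the behavior I am after in Pandas, mirroring the PySpark loop. It assumes a hypothetical df that only contains column_1, and it uses NaN as the null value (rather than the empty string used in the PySpark version); both are assumptions for illustration, not necessarily the best way to do it:

```python
import numpy as np
import pandas as pd

# Hypothetical frame for illustration: only column_1 is present.
df = pd.DataFrame({'column_1': [1, 2]})
column_list = ('column_1', 'column_2', 'column_3')

# Add every missing column, filled with NaN; existing columns are untouched.
for column in column_list:
    if column not in df.columns:
        df[column] = np.nan

print(df.columns.tolist())
```

After the loop, df should contain all three columns, with column_2 and column_3 entirely null.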