Is there a way to create an empty pandas dataframe from a pandera schema?
Given the following schema, I would like to get an empty dataframe as shown below:
import pandas as pd
import pandera as pa
from pandera.typing import Series, DataFrame

class MySchema(pa.DataFrameModel):
    state: Series[str]
    city: Series[str]
    price: Series[int]

def get_empty_df_of_schema(schema: type[pa.DataFrameModel]) -> pd.DataFrame:
    pass

wanted_result = pd.DataFrame(
    columns=['state', 'city', 'price']
).astype({'state': str, 'city': str, 'price': int})
wanted_result.info()
Desired result:
Index: 0 entries
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   state   0 non-null      object
 1   city    0 non-null      object
 2   price   0 non-null      int64
Edit:
Found a working solution:
def get_empty_df_of_pandera_model(model: type[pa.DataFrameModel]) -> pd.DataFrame:
    # Convert the class-based model to a DataFrameSchema to read its columns.
    schema = model.to_schema()
    column_names = list(schema.columns.keys())
    # Map each column name to the name of its underlying pandas/numpy dtype.
    data_types = {column_name: column.dtype.type.name for column_name, column in schema.columns.items()}
    return pd.DataFrame(columns=column_names).astype(data_types)
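
The last step of the solution above, turning a name-to-dtype mapping into an empty typed frame, can also be sketched with pandas alone. This is only an illustration: `empty_df_from_dtypes` is a hypothetical helper name, and the dtype mapping is written out by hand here rather than read from a pandera schema.

```python
import pandas as pd


def empty_df_from_dtypes(dtypes: dict) -> pd.DataFrame:
    """Build a zero-row DataFrame with one typed column per entry (hypothetical helper)."""
    # An empty Series created with an explicit dtype keeps that dtype,
    # so no astype() call is needed afterwards.
    return pd.DataFrame({name: pd.Series(dtype=dtype) for name, dtype in dtypes.items()})


# Hand-written mapping mirroring the columns of MySchema in the question.
empty = empty_df_from_dtypes({"state": str, "city": str, "price": "int64"})
print(empty.dtypes)
```

Building each column as an empty typed Series is an alternative to `pd.DataFrame(columns=...).astype(...)`. Both approaches should give a frame with zero rows and the requested columns.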