How to convert a list of Pydantic BaseModels to Pandas Dataframe

Question

I can't seem to find any built-in way of simply converting a list of Pydantic BaseModels to a Pandas Dataframe.

from pydantic import BaseModel
import pandas as pd

class SomeModel(BaseModel):
    col1: str
    col2: str

data = [SomeModel(**{'col1': 'foo', 'col2': 'bar'})] * 10
pd.DataFrame(data)

Output

>>         0            1
>> 0  (col1, foo)  (col2, bar)
>> 1  (col1, foo)  (col2, bar)
>> ...

This way the field names end up inside the cells as (name, value) tuples instead of becoming column headers. A workaround is the following:

pd.DataFrame([model.dict() for model in data])

Output

>>    col1 col2
>> 0  foo  bar
>> 1  foo  bar
>> ...
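Note that on Pydantic v2, `.dict()` is deprecated in favor of `.model_dump()`. A minimal sketch of the same workaround that tolerates both major versions could look like this; the helper name `models_to_frame` is hypothetical, not part of either library:

```python
import pandas as pd
from pydantic import BaseModel


class SomeModel(BaseModel):
    col1: str
    col2: str


def models_to_frame(models):
    """Hypothetical helper: build a DataFrame from a list of BaseModel instances."""
    rows = []
    for m in models:
        # Pydantic v2 exposes model_dump(); v1 only has dict().
        dump = getattr(m, "model_dump", None) or m.dict
        rows.append(dump())
    return pd.DataFrame(rows)


data = [SomeModel(col1="foo", col2="bar")] * 10
df = models_to_frame(data)
print(df.head())
```

The field names become the column headers, as in the output above.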

However, this method is a bit slow for larger amounts of data. Is there a faster way?

camo · Answer 1 · 2020-11-06T07:15:02.743

27

Quick-and-dirty profiling yields the following values:

from pydantic import BaseModel
import pandas as pd
from fastapi.encoders import jsonable_encoder

class SomeModel(BaseModel):
    col1: int
    col2: str

# 800,000 list entries (the same two instances repeated)
data = [SomeModel(col1=1, col2="foo"), SomeModel(col1=2, col2="bar")] * 4 * 10**5

import cProfile

# Each call profiles one way of turning the models into rows
cProfile.run('pd.DataFrame([s.dict() for s in data])')     # around 8.2s
cProfile.run('pd.DataFrame(jsonable_encoder(data))')       # around 30.8s
cProfile.run('pd.DataFrame([s.__dict__ for s in data])')   # around 1.7s
cProfile.run('pd.DataFrame([dict(s) for s in data])')      # around 3s
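One caveat when picking the faster variants: `s.__dict__` and `dict(s)` are shallow, while `.dict()` converts nested models recursively. For the flat model profiled above this makes no difference, but with nested fields the cells would hold model objects. A small sketch, assuming hypothetical `Inner` and `Outer` models that are not part of the answer:

```python
from pydantic import BaseModel


class Inner(BaseModel):  # hypothetical nested model, for illustration only
    x: int


class Outer(BaseModel):
    name: str
    inner: Inner


o = Outer(name="a", inner=Inner(x=1))

# Shallow: field name -> raw value, same contents as o.__dict__
shallow = dict(o)

# Recursive: Pydantic v2 renames dict() to model_dump(); fall back to dict() on v1
deep = (getattr(o, "model_dump", None) or o.dict)()

print(type(shallow["inner"]).__name__)  # Inner
print(type(deep["inner"]).__name__)     # dict
```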

edited Nov 06 '20 at 07:15

answered Nov 05 '20 at 15:20

camo


Would it be possible to add `vars(s)` to the comparison (it should be a more pythonic alternative to `s.__dict__` if it's the same performance-wise)? – Dev-iL May 14 '23 at 13:24
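For anyone who wants to run that comparison themselves, here is a rough sketch using `timeit`. `vars(s)` simply returns `s.__dict__`, so any gap between the two should come down to lookup overhead. The list is deliberately smaller than the one in the answer, and no timings are quoted here because they depend on the machine and library versions:

```python
import timeit

import pandas as pd
from pydantic import BaseModel


class SomeModel(BaseModel):
    col1: int
    col2: str


# Smaller than the list in the answer so the sketch finishes quickly.
data = [SomeModel(col1=1, col2="foo"), SomeModel(col1=2, col2="bar")] * 10_000

# vars(obj) returns obj.__dict__, so both variants build rows from the same dicts.
assert vars(data[0]) is data[0].__dict__

candidates = {
    "__dict__": lambda: pd.DataFrame([s.__dict__ for s in data]),
    "vars": lambda: pd.DataFrame([vars(s) for s in data]),
    "dict(s)": lambda: pd.DataFrame([dict(s) for s in data]),
}

# Time three runs of each variant; absolute numbers will vary by machine.
timings = {name: timeit.timeit(fn, number=3) for name, fn in candidates.items()}
for name, seconds in timings.items():
    print(f"{name:>10}: {seconds:.3f}s for 3 runs")
```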

score 12 · Answer 2 · answered Aug 12 '20 at 12:16


Not sure if it's faster, but FastAPI exposes jsonable_encoder which essentially performs that same transformation on an arbitrarily nested structure of BaseModel:

from fastapi.encoders import jsonable_encoder
pd.DataFrame(jsonable_encoder(data))

answered Aug 12 '20 at 12:16

patricksurry

