1

Context

SQLModel is heavily based on Pydantic. The latter has the create_model function allowing you to create/define a model class at runtime, by passing the field definitions as arbitrary keyword arguments.

There seems to be no special version of the create_model function built into SQLModel and there is no mention of dynamic model creation in the documentation.

Question

Is it possible to leverage the Pydantic create_model function to define a fully-functioning SQLModel class?

Requirements

Functioning in the sense that it will properly serve as an ORM class, play nice with a database engine, and its instances can be added to and refreshed by database sessions, just like any statically defined SQLModel subclass.

As a proof of concept, it should be possible to dynamically construct a working equivalent of the following statically defined model and perform the aforementioned operations with it:

from typing import Optional

from sqlmodel import Field, SQLModel

class Hero(SQLModel, table=True):
    id: Optional[int] = Field(default=None, primary_key=True)
    name: str
    secret_name: str
    age: Optional[int] = None
Daniil Fajnberg
  • 12,753
  • 2
  • 10
  • 41

2 Answers2

1

The answer is yes!

The create_model function has the optional __base__ parameter (as mentioned in the docs), which accepts any subclass (or sequence of subclasses) of the Pydantic BaseModel. The SQLModel base class happens to directly inherit from BaseModel and can thus be passed here.

However, this is not sufficient to have a model that maps to a table. The SQLModelMetaclass requires table=True to be passed as a keyword argument during subclassing of SQLModel. Luckily, there is a solution for this built into Pydantic as well.

While this is mentioned nowhere on Pydantic's documentation website, the create_model function (source here) has a __cls_kwargs__ parameter for being able to pass arbitrary keyword arguments to the metaclass during class creation.

These two components, together with the actual field definitions, are actually all we need to dynamically create our ORM class. Here is a full working example:

from typing import Optional

from pydantic import create_model
from sqlmodel import Field, Session, SQLModel, create_engine

field_definitions = {
    "id": (Optional[int], Field(default=None, primary_key=True)),
    "name": (str, ...),
    "secret_name": (str, ...),
    "age": (Optional[int], None),
}

Hero = create_model(
    "Hero",
    __base__=SQLModel,
    __cls_kwargs__={"table": True},
    **field_definitions,
)

if __name__ == '__main__':
    sqlite_url = "sqlite:///test.db"
    engine = create_engine(sqlite_url, echo=True)
    SQLModel.metadata.create_all(engine)
    session = Session(engine)
    hero = Hero(name="Spider-Boy", secret_name="Pedro Parqueador")
    session.add(hero)
    session.commit()
    session.refresh(hero)
    print(hero)

That print statement gives the following output:

secret_name='Pedro Parqueador' id=1 age=None name='Spider-Boy'

That demonstrates that the id was created by the database upon insertion.

The SQL statements printed to stdout by the engine show that everything went as planned:

CREATE TABLE hero (
    id INTEGER NOT NULL, 
    name VARCHAR NOT NULL, 
    secret_name VARCHAR NOT NULL, 
    age INTEGER, 
    PRIMARY KEY (id)
)

...

INSERT INTO hero (name, secret_name, age) VALUES (?, ?, ?)
('Spider-Boy', 'Pedro Parqueador', None)

...

SELECT hero.id, hero.name, hero.secret_name, hero.age 
FROM hero 
WHERE hero.id = ?
(1,)

So far, I have not encountered any caveats to this approach, beyond those that apply to dynamic model creation in Pydantic as well, such as the obvious lack of static type checking support or auto-suggestions, if a model was defined dynamically.


Tested with pydantic>=1.10.4,<2.* and sqlmodel==0.0.8.

Daniil Fajnberg
  • 12,753
  • 2
  • 10
  • 41
  • With this approach the field_definitions for every table must be statically defined. I don't understand how this is considered 'dynamic'. My understanding of 'dynamic' would be to generate a SQLModel model without having to create static definitions of the fields beforehand. We should be able to use any database and reflect the tables as models without knowing what the fields are for each table. – Liquidgenius Mar 23 '23 at 14:24
  • @Liquidgenius It seems you are operating under a different definition of _"dynamic"_ in this context. When talking about the definition of Pydantic models, _dynamic_ refers to the fact that the entire model definition only happens at runtime and that the `field_definitions` are constructed as a dictionary as opposed to _statically_ defining the model **class** as you would normally do. It means you could e.g. make the entire model definition dependent upon some runtime parameters and thus different from run to run. – Daniil Fajnberg Mar 23 '23 at 14:31
  • @Liquidgenius See [this section of the Pydantic docs](https://docs.pydantic.dev/usage/models/#dynamic-model-creation) for reference. I am using the term in that sense. And no, the `field_definitions` do not have to be statically defined. That is the entire point. What you are interested in is something else entirely. The ability to generate a model from any existing database schema. I am not aware of any tools that can do that. If you find something like this, I would be very interested to know about it. – Daniil Fajnberg Mar 23 '23 at 14:32
  • I am describing Reflection, it is a dynamic process of 'looking' at a database table and generating a Model definition directly from the table meta (without externally provided definitions). SQLAlchemy provides this, and given that SQLModel dependencies are both SQLAlchemy and Pydantic, I am searching for a way to accomplish (dynamic) Reflection with SQLModel. https://docs.sqlalchemy.org/en/20/core/reflection.html – Liquidgenius Mar 23 '23 at 14:50
  • 1
    I wonder if you could use SQLAlchemy's Reflection, and dynamically generate your `field_definitions` from the meta, then use your process to generate the model. That would allow fully dynamic creation of a model without knowing beforehand what fields existed in the database table... – Liquidgenius Mar 23 '23 at 14:57
  • When I run your code, I get this error at the `session.add` step: `sqlalchemy.orm.exc.UnmappedInstanceError: Class 'pydantic.main.Hero' is not mapped`. Do you know why? – nsheff Mar 23 '23 at 21:29
  • @nsheff You probably forgot to add the `table=True` keyword-argument via `__cls_kwargs__`. – Daniil Fajnberg Mar 23 '23 at 21:30
  • @DaniilFajnberg hmm... no, I literally copy/pasted the code from here into a file... it's there... -- it was a version issue. I updated SQLmodel/pydantic and now it works. – nsheff Mar 24 '23 at 00:08
0

My intention with 'dynamically' was to find a way to produce models that could be utilized at runtime without the need for the developer to define them in any way.

That being said, sqlacodegen is a command line tool that accomplish just that. Sqlacodegen can dynamically produce models from a database by simply supplying a proper connection url. Sqlacodegen leverages sqlalchemy, and it's automapping features.

The following code is a proof of concept which utilizes Sqlacodegen to produce SQLModel models dynamically. It supports in-memory production via the get_models() method, and production to an importable module with the save() method.

You will need to install SQLModel and the pre-release version of sqlacodegen.

Installs

pip install sqlmodel
pip install --pre sqlacodegen

OR

poetry add sqlmodel
poetry add sqlacodegen@3.0.0rc2

Usage

modeler = ModelSql(db_url="dialect+driver://user:password@host:port/database")
models = modeler.get_models()
modeler.save()

Code

import inspect
import importlib
import subprocess
from pathlib import Path
import importlib.metadata
from typing import List, Optional, Dict, Any, Type
from packaging.version import parse as parse_version


class ModelSql:
    """
    A wrapper class for dynamically generating in-memory models from a database using SQLModel and sqlacodegen.
    Please reference sqlacodegen and sqlalchemy for in-depth documentation.

    Sqlacodegen is in need of help. It's a great tool, please consider lending a hand to the project:
    https://github.com/agronholm/sqlacodegen

    Args:
        db_url (Optional[str]): The database URL. Defaults to None.
        generator (str): The code generator to use. Defaults to "sqlmodels".
        options (Optional[List[str]]): Additional options for the code generator. Defaults to None.
        only (Optional[List[str]]): List of table names to include. Defaults to None.

    Example:
        > db_url = ""
        > modeler = ModelSql(db_url="dialect+driver://user:password@host:port/database")
        > models = modeler.get_models()  # use dynamically
        > modeler.save()  # save for later import

    TODO: Implement support for other code generators supported by sqlacodegen.
    """

    def __init__(
        self,
        db_url: Optional[str] = None,
        generator: str = "sqlmodels",
        options: Optional[List[str]] = None,
        only: Optional[List[str]] = None,
    ) -> None:
        self.db_url = db_url
        self.generator = generator
        self.options = options
        self.only = only
        self.model_base_cls = None
        self.code = ""
        self.models: Dict[str, Any] = {}
        self._workflow()

    def _workflow(self) -> None:
        """
        Execute the workflow for generating models from the database.
        """
        self._check_min_module_version("sqlacodegen", "3.0.0rc2")

        if self.generator == "sqlmodels":
            self._check_min_module_version("sqlmodel", "0.0.8")
            module = importlib.import_module("sqlmodel")
            self.model_base_cls = getattr(module, "SQLModel")

        self._generate_code()

        if self.generator == "sqlmodels":
            self._compile_models()

    def _generate_code(self) -> None:
        """
        Generate the code using sqlacodegen.
        """
        cmd = ["sqlacodegen", "--generator", self.generator, self.db_url]

        if self.options:
            cmd.extend(["--option", option] for option in self.options)

        if self.only:
            cmd.extend(["--tables", ",".join(self.only)])

        process = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
        stdout, _ = process.communicate()
        self.code = stdout

    def _compile_models(self) -> None:
        """
        Compile the generated code and populate the models dictionary.
        """
        compiled_code = compile(self.code, "<string>", "exec")
        module_dict: Dict[str, Any] = {}
        exec(compiled_code, module_dict)
        self.models = module_dict

    @staticmethod
    def _check_min_module_version(module_name: str, min_version: str) -> None:
        """
        Checks if the specified module has a minimum required version.

        Args:
            module_name (str): The name of the module to check.
            min_version (str): The minimum required version in string format.

        Raises:
            ValueError: If the module version is lower than the minimum required version.
            ModuleNotFoundError: If the module is not installed.
            RuntimeError: If an error occurs while checking the module version.
        """
        try:
            module_version = importlib.metadata.version(module_name)
            if parse_version(module_version) < parse_version(min_version):
                raise ValueError(
                    f"{module_name} version {min_version} or greater is required, but found version {module_version}."
                )
        except importlib.metadata.PackageNotFoundError as e:
            raise ModuleNotFoundError(f"{module_name} is not installed.") from e
        except Exception as e:
            raise RuntimeError(
                f"An error occurred while checking the version of {module_name}: {str(e)}"
            ) from e

    def get_models(self, only: Optional[List[str]] = None) -> List[Type[Any]]:
        """
        Get a list of generated models.

        Args:
            only (Optional[List[str]]): List of model names to include. Defaults to None.

        Returns:
            List[Type[Any]]: List of generated model classes.
        """
        if only is None:
            return [
                model_cls
                for model_cls in self.models.values()
                if inspect.isclass(model_cls) and issubclass(model_cls, self.model_base_cls)
            ]
        else:
            return [
                model_cls
                for model_cls in self.models.values()
                if (
                    inspect.isclass(model_cls)
                    and issubclass(model_cls, self.model_base_cls)
                    and model_cls.__name__ in only
                )
            ]

    def save(
        self,
        filename: str = "models",
        to_path: Optional[str] = None
    ) -> None:
        """
        Save the generated models to files.

        Args:
            filename (str): Name of the file. Defaults to "models" (".py" extension will be added if not present).
            to_path (Optional[str]): Path to save the file. Defaults to None (current working directory).
        """
        to_path = Path(to_path) if to_path else Path.cwd()

        filename += "" if filename.endswith(".py") else ".py"

        with open(to_path / filename, "w") as file:
            file.write(self.code)
Liquidgenius
  • 639
  • 5
  • 17
  • 32