Here's a method that extends the declarative `Base` class in SQLAlchemy 2.0, inspired by @zwirbeltier's answer. It also recursively clones the object's relationships. If you, like me, work in a database where primary keys aren't always autogenerated, it accepts a primary key argument. A usage sketch follows the snippet below.

Note: Please test before implementing.
```python
from __future__ import annotations

from typing import TypeVar

from sqlalchemy import inspect
from sqlalchemy.orm import DeclarativeBase, Mapper

_T = TypeVar("_T", bound="Base")


def db_to_attr(cls_mapper: Mapper, db_name: str) -> str:
    """Return the attribute name from the database name of a column."""
    for attr in cls_mapper.attrs:
        if type(attr).__name__ == "ColumnProperty":
            if attr.key == db_name:
                return attr.key
            elif cls_mapper.c[attr.key].name == db_name:
                return attr.key
    raise ValueError(f"Unknown database name: {db_name}")
def updated_pks(obj: _T, new_pk_vals: dict) -> dict:
    """Return a dict of primary keys updated with new_pk_vals."""
    cls_mapper = inspect(obj.__class__)
    pk_columns = {db_to_attr(cls_mapper, c.key): c for c in cls_mapper.primary_key}
    pk_vals = {key: getattr(obj, key) for key in pk_columns.keys()}
    # update only keys that are in the primary key
    for k in new_pk_vals.keys():
        # fall back to existing value if new value is None and there is no default
        default_val = (
            None
            if pk_columns[k].default or pk_columns[k].server_default
            else pk_vals[k]
        )
        pk_vals[k] = new_pk_vals[k] or default_val
    return pk_vals
class Base(DeclarativeBase):
    """Base class for SQLAlchemy model definitions."""

    def clone(
        self,
        pk_id: int | str | tuple | dict | None = None,
        include_relationships: bool = True,
        **kwargs,
    ) -> _T:
        """
        Clone the object with the given primary key and kwargs, including FK relationships.

        Update the primary key and any other attributes passed as kwargs.
        Leave pk_id as None and SQLAlchemy will fall back to the column default, if one is defined.
        If pk_id is a tuple, it must match the number of primary key columns.
        include_relationships will clone FK relationships recursively if True.

        :param pk_id: primary key value, or a tuple or dict of primary key values
            corresponding to the primary key columns
        :param include_relationships: clone FK relationship attributes recursively if True
        :param kwargs: attributes to update
        """
        obj_mapper = inspect(self)
        cls_mapper = inspect(self.__class__)
        pk_attrs = [db_to_attr(cls_mapper, c.key) for c in cls_mapper.primary_key]

        # check that the object is loaded
        if not obj_mapper.persistent and include_relationships:
            raise ValueError(
                "Object must be loaded or exclude relationships before cloning."
            )

        # coerce pk_id to a dict keyed by primary key attribute names
        if not isinstance(pk_id, dict):
            # extract new pk vals from kwargs if not passed
            if pk_id is None:
                pk_id = {k: kwargs.pop(k) for k in pk_attrs if k in kwargs}
            else:
                # coerce a scalar pk_id to a tuple
                if isinstance(pk_id, (int, str)):
                    pk_id = (pk_id,)
                # map the values onto the primary key attribute names
                pk_id = dict(zip(pk_attrs, pk_id))
        # set missing pk values to None
        pk_id = {k: pk_id.get(k) for k in pk_attrs}

        # combine pk_id and kwargs
        updated = updated_pks(self, pk_id)
        updated.update(kwargs)

        # get model columns and values
        data = {
            column: getattr(self, column)
            for column in cls_mapper.columns.keys()
            if column not in pk_attrs
        }
        # get model relationships and values
        if include_relationships:
            for attr in cls_mapper.attrs:
                if type(attr).__name__ == "ColumnProperty":
                    continue
                elif type(attr).__name__ == "Relationship":
                    # skip write_only and viewonly relationships
                    if attr.lazy == "write_only" or attr.viewonly:
                        continue
                    remote_cls_mapper = attr.mapper
                    # skip relationships where the parent key is on the remote side
                    remote_pk_columns = {c.key for c in remote_cls_mapper.primary_key}
                    remote_columns = {c.key for c in attr.remote_side}
                    if remote_pk_columns == remote_columns:
                        continue
                    # if there are no updated columns on the remote side, don't clone
                    if not any(c in attr.mapper.column_attrs for c in updated.keys()):
                        data[attr.key] = getattr(self, attr.key)
                        continue
                    # get the updated relationship columns and values
                    updated_rel_vals = {
                        c.key: None
                        for c in remote_cls_mapper.primary_key
                        if c.server_default is not None
                    }
                    rel_attrs = [
                        db_to_attr(cls_mapper, c.key) for c in attr.local_columns
                    ]
                    rel_data = {c: updated[c] for c in rel_attrs if c in updated}
                    updated_rel_vals.update(rel_data)
                    # don't recurse more than one level if the relationship is many-to-many
                    recurse = attr.direction.name != "MANYTOMANY"
                    # iterate over a collection and clone each member
                    # (alternatively, check attr.uselist)
                    if attr.collection_class is not None:
                        collection = attr.collection_class()
                        for obj in getattr(self, attr.key):
                            collection.append(
                                obj.clone(
                                    include_relationships=recurse,
                                    **updated_rel_vals,
                                )
                            )
                        data[attr.key] = collection
                    else:
                        obj = getattr(self, attr.key)
                        if obj is not None:
                            data[attr.key] = obj.clone(**updated_rel_vals)
                else:
                    raise ValueError(f"Unhandled attribute type: {type(attr).__name__}")

        # update attributes
        data.update(updated)
        # create a new object
        cls = type(self)
        return cls(**data)
```
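
For context, here's roughly how it can be used with composite, non-autogenerated keys, which is the scenario the recipe is aimed at. The `Order`/`OrderLine` models, the SQLite URL, and all values are hypothetical and only for illustration; treat this as a sketch assuming the `Base` class above is in scope, not a tested reference:

```python
from sqlalchemy import ForeignKey, create_engine
from sqlalchemy.orm import Mapped, Session, mapped_column, relationship


class Order(Base):
    __tablename__ = "orders"

    order_id: Mapped[int] = mapped_column(primary_key=True)
    note: Mapped[str]
    lines: Mapped[list["OrderLine"]] = relationship(back_populates="order")


class OrderLine(Base):
    __tablename__ = "order_lines"

    # composite primary key that shares order_id with the parent table
    order_id: Mapped[int] = mapped_column(
        ForeignKey("orders.order_id"), primary_key=True
    )
    line_no: Mapped[int] = mapped_column(primary_key=True)
    sku: Mapped[str]
    order: Mapped["Order"] = relationship(back_populates="lines")


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    original = Order(
        order_id=1,
        note="original",
        lines=[OrderLine(line_no=1, sku="A"), OrderLine(line_no=2, sku="B")],
    )
    session.add(original)
    session.commit()

    # clone under a new primary key; the lines are cloned recursively and
    # pick up order_id=2 through the relationship handling, while each
    # line_no falls back to its existing value in updated_pks()
    copy = original.clone(pk_id=2, note="copy")
    session.add(copy)
    session.commit()
```

Note that, as far as I can tell from `updated_pks`, a primary key column with no `default` or `server_default` falls back to its existing value when you leave `pk_id` unset, so with a plain surrogate key you'd pass a new `pk_id` explicitly (or define a default) and let the database enforce integrity, as the edit note below mentions.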
Edits
2023-09-01: Factor out helper methods and leave PK integrity validation to the DB.