
I'm working with SQLAlchemy models which look something like this:

class mytableclass(Base):
    __tablename__ = "table_name"

    __table__ = Table(
        __tablename__,
        Base.metadata,
        Column("Customer_id", String, primary_key=True),
        Column("Full_name", String),
        Column("Occupation", String),
    )

I have around 50-60 of these Python files with SQLAlchemy table definitions, and I need to make some updates to them. I have a CSV where I've defined the class names and what needs to be updated: add a column, change a column type, or add some metadata to a column.
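For context, such an updates CSV can be parsed with the stdlib `csv` module. The column names and `action` values below are hypothetical, just to make the idea concrete:

```python
import csv
import io

# A hypothetical updates CSV: each row names a model class and the change
# to apply. The header names ("classname", "action", ...) are made up for
# illustration, not a fixed schema.
UPDATES_CSV = """\
classname,action,column,type,comment
Customer,add_column,age,Integer,
Customer,change_type,Occupation,Text,
Order,add_comment,order_id,,Primary key from ERP
"""

def load_updates(text):
    """Parse the CSV into a list of update dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))

updates = load_updates(UPDATES_CSV)
print(updates[0]["action"])  # -> add_column
```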

Since I'm working with such a large number of files, I'm wondering how I could achieve this programmatically rather than manually, since it will be required again in the future.

One approach is to treat the Python files as strings and build some logic around that. But I have a feeling that since it's a Python data structure, there should be some "Pythonic" way of achieving this. All suggestions are welcome.
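One "Pythonic" angle on at least the *reading* half is the stdlib `ast` module: parse a model file without executing it and walk it for `Column(...)` calls. A minimal sketch, where the source string stands in for the contents of one model file:

```python
import ast

# Stands in for the contents of one model file.
SOURCE = '''
class mytableclass(Base):
    __tablename__ = "table_name"

    __table__ = Table(
        __tablename__,
        Base.metadata,
        Column("Customer_id", String, primary_key=True),
        Column("Full_name", String),
        Column("Occupation", String),
    )
'''

def column_names(source):
    """Return the first-argument string of every Column(...) call."""
    names = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id == "Column"
                and node.args
                and isinstance(node.args[0], ast.Constant)):
            names.append(node.args[0].value)
    return names

print(column_names(SOURCE))  # -> ['Customer_id', 'Full_name', 'Occupation']
```

Writing edits back is the hard part: `ast.unparse` (Python 3.9+) drops comments and formatting, which is why round-tripping tools such as LibCST exist.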

Anton

1 Answer


You could dynamically load the entire set of models, as in the prototype below. It doesn't handle a lot of edge cases, like passing extra arguments to a table or building relationships.

I think the real problem is that when you add a column you have to alter the table and potentially manipulate/migrate the existing data for it to make sense. Something like Alembic can attempt to auto-generate migrations, but most of the time you have to customize them or add parts that could not be automatically detected. So you could reload all the models from the CSV and then run a manually coded migration to update the actual data in the database. I don't think it would really be worth it compared to just changing the model file manually, since you still have to write the migration.
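For reference, the hand-written part of such a migration might look roughly like this. It is a fragment of an Alembic revision file, not standalone code, and the table/column names are made up:

```python
# Fragment of a hypothetical Alembic revision file (not runnable on its own;
# Alembic supplies the migration context when it executes upgrade/downgrade).
from alembic import op
import sqlalchemy as sa

def upgrade():
    # Add the new column, then backfill existing rows so the data still
    # makes sense -- the part autogenerate cannot write for you.
    op.add_column("customers", sa.Column("segment", sa.String(), nullable=True))
    op.execute("UPDATE customers SET segment = 'unknown'")

def downgrade():
    op.drop_column("customers", "segment")
```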

Even if the models were stored in JSON and your migration altered both the database and the JSON, it would be a huge custom thing you'd always be working around. For example, this doesn't handle association tables that don't have a corresponding class. There would be so many edge cases. Interesting idea, though.

There are some links at the end.

Anyways, this was my thought:

# SQLAlchemy imports used below (declarative_base and Session live in
# sqlalchemy.orm as of SQLAlchemy 1.4).
from sqlalchemy import Column, Integer, String, Table, create_engine
from sqlalchemy.orm import Session, declarative_base


Base = declarative_base()


engine = create_engine("sqlite://", echo=False)

# Constrain allowed types by filtering them through whitelist.
allowed_types = {t.__name__: t for t in (String, Integer)}

# Extract this from CSV.
models = [{
    'classname': 'Customer',
    'tablename': 'customers',
    'cols': [
        {'name': 'customer_id', 'type': 'String', 'extra_kwargs': {'primary_key': True}},
        {'name': 'full_name', 'type': 'String'},
        {'name': 'occupation', 'type': 'String'},
    ],
}]

# Create class and table for each model and then shove the class
# into the globals()
for model in models:
    cols = [Column(col['name'], allowed_types[col['type']], **col.get('extra_kwargs', {})) for col in model['cols']]
    globals()[model['classname']] = type(model['classname'], (Base,), {
        '__tablename__': model['tablename'],
        '__table__': Table(
            model['tablename'],
            Base.metadata,
            *cols),
    })

Base.metadata.create_all(engine)

with Session(engine) as session:
    customer = Customer(customer_id='some-guy', full_name='Some Guy', occupation='Some Job')
    session.add(customer)
    customers = session.query(Customer).all()
    assert len(customers) == 1
    assert customers[0].customer_id == 'some-guy'

alembic operations

dynamic class generation

type built-in

Ian Wilson
  • Hi Ian, thanks for the answer, it definitely helps me get on the right track. Some clarifications, though: I don't have the complete list of columns, I only know the ones that need to be edited; the rest of the columns need to be read from the class and table definition somehow. Lastly, how would I dump this thing to a .py file? Or does this globals() achieve it somehow? I'm unfamiliar with this and will need to read up on it. – Anton Sep 10 '21 at 07:25
  • @Anton I thought more about this and it isn't really what you asked. I think manipulating Python code itself will be really difficult. This solution just creates Python SQLAlchemy classes dynamically from JSON-style definitions. So it would replace all the model files with a single file that held all the models produced from the JSON. I'm not sure it would save time, because the implementation is not complete and might be hard to complete fully. – Ian Wilson Sep 11 '21 at 04:28
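Following up on the "dump this thing to a .py file" question from the comments: one sketch is plain string templating from the same dict-based model definitions used in the answer. The template and helper names here are illustrative, not a complete code generator:

```python
# Sketch: generate model .py source from the dict-based definitions used
# in the answer. Template and helper names are made up for illustration.
MODEL_TEMPLATE = '''\
class {classname}(Base):
    __tablename__ = "{tablename}"

    __table__ = Table(
        __tablename__,
        Base.metadata,
{columns}
    )
'''

def render_column(col):
    """Render one Column(...) line from a column dict."""
    kwargs = "".join(
        f", {k}={v!r}" for k, v in col.get("extra_kwargs", {}).items()
    )
    return f'        Column("{col["name"]}", {col["type"]}{kwargs}),'

def render_model(model):
    """Render a full class definition from a model dict."""
    columns = "\n".join(render_column(c) for c in model["cols"])
    return MODEL_TEMPLATE.format(
        classname=model["classname"],
        tablename=model["tablename"],
        columns=columns,
    )

model = {
    "classname": "Customer",
    "tablename": "customers",
    "cols": [
        {"name": "customer_id", "type": "String",
         "extra_kwargs": {"primary_key": True}},
        {"name": "full_name", "type": "String"},
    ],
}
print(render_model(model))
```

This only covers writing out fresh files; merging edits into hand-written model files (preserving comments, imports, relationships) is the genuinely hard part discussed above.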