1

I have created a database with pandas :

import numpy as np                                                                                                                                                                                          
import sqlite3                                                                                                                                                                                              
import pandas as pd                                                                                                                                                                                         
import sqlite3                                                                                                                                                                                              
import sqlalchemy                                                                                                                                                                                           
from sqlalchemy import create_engine                                                                                                                                                                        
from sqlalchemy.orm import sessionmaker                                                                                                                                                                     

df = pd.DataFrame(np.random.normal(0, 1, (10, 2)), columns=['A', 'B'])                                                                                                                                      

path = 'sqlite:////home/username/Desktop/example.db'                                                                                                                                                        

engine = create_engine(path, echo=False)                                                                                                                                                                    

df.to_sql('flows', engine, if_exists='append', index=False)                                                                                                                                                 

# This is only to show I am able to read the database                                                                                                                                                                                                            
df_l = pd.read_sql("SELECT * FROM flows WHERE A>0 AND B<0", engine)                                                                                                                                         

Now I would like to add one or more indexes to the database. Is this case I would like to make first only the column A and then both the columns indices.

How can I do that?

If possible I would like a solution that uses only SqlAlchemy so that it is independent from the choice of the database.

Donbeo
  • 17,067
  • 37
  • 114
  • 188
  • Normally one would use a migration tool such as [alembic](http://alembic.zzzcomputing.com/en/latest/) to handle DB migrations, but not sure how well it'd fit your use case. – Ilja Everilä Sep 19 '17 at 07:57
  • My use case is very simple. Any procedure that will not require too much work (time spent learning a new tool) is good. – Donbeo Sep 19 '17 at 08:05

2 Answers2

2

You should use reflection to get hold of the table that pandas created for you.

With reference to:

SQLAlchemy Reflecting Database Objects

A Table object can be instructed to load information about itself from the corresponding database schema object already existing within the database. This process is called reflection. In the most simple case you need only specify the table name, a MetaData object, and the autoload=True flag. If the MetaData is not persistently bound, also add the autoload_with argument:

you could try this:

meta = sqlalchemy.MetaData()
meta.reflect(bind=engine)
flows = meta.tables['flows']
# alternative of retrieving the table from meta:
#flows = sqlalchemy.Table('flows', meta, autoload=True, autoload_with=engine)

my_index = sqlalchemy.Index('flows_idx', flows.columns.get('A'))
my_index.create(bind=engine)

# lets confirm it is there
inspector = reflection.Inspector.from_engine(engine)
print(inspector.get_indexes('flows'))
craigsparks
  • 134
  • 6
0

This seems to work for me. You will have to define the variables psql_URI, table, and col yourself. Here I assume that the table name / column name may be in (partial) uppercase but you want the name of the index to be lowercase.

Derived from the answer here: https://stackoverflow.com/a/72976667/3406189

import sqlalchemy
from sqlalchemy.orm import Session

engine_psql = sqlalchemy.create_engine(psql_URI)
autocommit_engine = engine_psql.execution_options(isolation_level="AUTOCOMMIT")
with Session(autocommit_engine) as session:
    session.execute(
        f'CREATE INDEX IF NOT EXISTS idx_{table.lower()}_{col.lower()} ON sdi_ai."{table}" ("{col}");'
    )

grofte
  • 1,839
  • 1
  • 16
  • 15