
I have a PostgreSQL table with 2+ million records. I need to narrow these down to roughly 100k.

I have a list of values to filter by:

codes = ['17389', '77289', ...]

And I have a working SQLAlchemy statement:

stmt = BarCodes.__table__.delete().where(~BarCodes.code.in_(codes))

The only issue is that this statement takes a long time to run. I'm guessing it scans the list of codes for every row to check for a match. Is there a more efficient way to do this?

  • Related https://stackoverflow.com/q/34263149/5320906 – snakecharmerb Apr 04 '21 at 13:58
  • It compiles to a single query, but depending on your version of SQLAlchemy, compiling huge lists into IN clauses can take quite a long time. Not to mention that the query itself will likely take long to execute, especially if you have lots of cascading foreign key constraints. – Ilja Everilä Apr 04 '21 at 15:09
  • Hi, do you have an index on the code column? – mshabou Apr 04 '21 at 18:39
  • @snakecharmerb that ended up solving my problem. I also needed to add an index on the code column. I had to create a temporary table as well, since my list was being pulled from an Excel sheet (see the sketch below). – Sean Payne Apr 06 '21 at 11:59
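
For reference, here is a minimal sketch of the temp-table approach described in the comments: load the keep list into an indexed temporary table, then delete with an anti-join instead of compiling 100k values into a NOT IN clause. The connection URL, the bar_codes table name, and the keep_codes temp table are illustrative assumptions, not taken from the question.

from sqlalchemy import create_engine, text

# Assumed connection details -- replace with your own.
engine = create_engine("postgresql://user:pass@localhost/mydb")

codes = ['17389', '77289']  # the full 100k-value keep list

with engine.begin() as conn:
    # Load the keep list into a temporary table; the PRIMARY KEY
    # gives PostgreSQL an index to join against.
    conn.execute(text("CREATE TEMP TABLE keep_codes (code text PRIMARY KEY)"))
    conn.execute(
        text("INSERT INTO keep_codes (code) VALUES (:code)"),
        [{"code": c} for c in codes],
    )
    # Delete every row whose code is absent from the temp table.
    # With an index on bar_codes.code as well, PostgreSQL can resolve
    # this anti-join without rescanning the value list per row.
    conn.execute(text(
        "DELETE FROM bar_codes b "
        "WHERE NOT EXISTS (SELECT 1 FROM keep_codes k WHERE k.code = b.code)"
    ))

Because the temp table lives only for the session, no cleanup is needed; it disappears when the connection closes.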

0 Answers