I have a DELETE queries in Redshift that takes up to 40 seconds in productions. The queries are created programatically is looks like
EXPLAIN DELETE FROM platform.myTable WHERE id IN ('77258ef98a7a87a0.W01.7388.c930db5e66203047','77258ef98319adc4.W01.73ad.c930db5e66203047','77258ef97d8ff761.W01.73be.c930db5e66203047','77258ef985a5c1be.W01.738a.c930db5e66203047')
And the result of the EXPLAIN query shows
XN Seq Scan on mytable (cost=0.00..23725.54 rows=5 width=6)
Filter: (((id)::text = '77258ef97d8ff761.W01.73be.c930db5e66203047'::text) OR ((id)::text = '77258ef98319adc4.W01.73ad.c930db5e66203047'::text) OR ((id)::text = '77258ef985a5c1be.W01.738a.c930db5e66203047'::text) OR ((id)::text = '77258ef98a7a87a0.W01.7388.c930db5e66203047'::text))
And my table schema looks like
CREATE TABLE IF NOT EXISTS platform.myTable(
userId varchar(150) encode lzo,
eventTime timestamp encode delta32k,
id varchar(150) encode lzo,
typeId varchar(150) encode lzo,
...
)
distkey (typeId)
compound sortkey (eventTime, typeId, userId ... id);
Would adding a primary key of 'id' to the table make the DELETE query a point looking up instead of a scan and filter operation?