The scenario:
The main table ("maintable") has tens of millions of rows (around 40M+).
A related table has about 2M rows (let's call it table_b).
We are currently migrating from Informix to MS SQL Server, and I thought this might be a good opportunity to optimize the data schema somewhat.
In the old DB there is a trigger that copies the contents of a column ("name") from table_b into the main table (maintable.table_b_name), so that filtering on table_b.name can be done against maintable directly.
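Roughly, the setup looks like this (a simplified sketch; the column names, types, and the exact trigger body are my guesses here, not the real schema, and the original trigger lives on Informix):

```sql
-- Simplified sketch (assumed names/types); the real tables have many more columns.
CREATE TABLE table_b (
    id   INT PRIMARY KEY,
    name NVARCHAR(100) NOT NULL
);

CREATE TABLE maintable (
    id           INT IDENTITY PRIMARY KEY,
    table_b_id   INT NOT NULL REFERENCES table_b (id),
    table_b_name NVARCHAR(100) NULL   -- redundant copy of table_b.name
);
GO

-- Keeps the copied column filled; this is only a rough T-SQL equivalent
-- of what the Informix trigger does.
CREATE TRIGGER trg_maintable_copy_name ON maintable
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    UPDATE m
    SET    m.table_b_name = b.name
    FROM   maintable m
    JOIN   inserted  i ON i.id = m.id
    JOIN   table_b   b ON b.id = i.table_b_id;
END;
```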
I did some tests in the new DB, and it turns out that in some scenarios (a plain select * from maintable) filtering via exists (select 'x' from table_b where ...) is just as fast as filtering on the copied column.
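For reference, the two variants I compared look roughly like this (the filter value and the join column are placeholders, not the real ones):

```sql
-- Variant 1: filter via the copied (denormalized) column
SELECT *
FROM   maintable a
WHERE  a.table_b_name = N'some name';

-- Variant 2: filter via EXISTS against table_b
SELECT *
FROM   maintable a
WHERE  EXISTS (SELECT 'x'
               FROM   table_b b
               WHERE  b.id   = a.table_b_id
               AND    b.name = N'some name');
```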
In other scenarios, however, filtering on the copied column is a lot faster (roughly twice as fast). To be more specific: select a.column, b.column from maintable a inner join sometable b on [SOMECONDITION].
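The slow case has roughly this shape (sometable, the column names, and the join/filter conditions below are placeholders standing in for the real ones):

```sql
-- Same EXISTS filter as above, but with an additional join;
-- this is where the copied column was about twice as fast.
SELECT a.column1,
       s.column2
FROM   maintable a
INNER JOIN sometable s ON s.maintable_id = a.id   -- [SOMECONDITION]
WHERE  EXISTS (SELECT 'x'
               FROM   table_b b
               WHERE  b.id   = a.table_b_id
               AND    b.name = N'some name');
```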
So, now my questions:
- Would filtering on a copied column always be faster, or are we just missing some indexes? (The execution plan does not hint at any missing ones, though.)
- Why is the select * from maintable about the same speed when filtering via exists (subselect) as when filtering on the copied column, but a LOT slower once a join is involved?
I would very much like to eliminate the copy, because 40M rows * a redundant column = a lot of wasted space (which I suppose isn't good from a performance perspective either).