I have two tables in redshift:
- tbl_current_day - about 4.5M rows
- tbl_previous_day - about 4.5M rows, with the same data exactly as tbl_current_day
In addition to it, I have a view called qry_both_days defined as following:
CREATE OR REPLACE qry_both_days AS
SELECT * FROM tbl_current_day
UNION SELECT * FROM tbl_previous_day;
When I run a query on one of the separate tables, I get very good performance as expected. For example, the following query runs 5 seconds:
select count(distinct person_id) from tbl_current_day;
-- (person_id is of type int)
Explain plan:
XN Aggregate (cost=1224379.82..1224379.82 rows=1 width=4)
-> XN Subquery Scan volt_dt_0 (cost=1224373.80..1224378.61 rows=481 width=4)
-> XN HashAggregate (cost=1224373.80..1224373.80 rows=481 width=4)
-> XN Seq Scan on tbl_current_day (cost=0.00..979499.04 rows=97949904 width=4)
Note that width is 4 bytes, as it's supposed to be, as my column is of type int.
HOWEVER, when I run the same query on qry_both_days the query runs 20 times slower, while I would expect it to run only 2 times slower, as it should go over twice more rows:
select count(distinct person_id) from qry_both_days;
Explain plan:
XN Aggregate (cost=55648338.34..55648338.34 rows=1 width=4)
-> XN Subquery Scan volt_dt_0 (cost=55648335.84..55648337.84 rows=200 width=4)
-> XN HashAggregate (cost=55648335.84..55648335.84 rows=200 width=4)
-> XN Subquery Scan qry_both_days (cost=0.00..54354188.49 rows=517658938 width=4)
-> XN Unique (cost=0.00..49177599.11 rows=517658938 width=190)
-> XN Append (cost=0.00..10353178.76 rows=517658938 width=190)
-> XN Subquery Scan "*SELECT* 1" (cost=0.00..89649.20 rows=4482460 width=190)
-> XN Seq Scan on tbl_current_day (cost=0.00..44824.60 rows=4482460 width=190)
-> XN Subquery Scan "*SELECT* 2" (cost=0.00..90675.00 rows=4533750 width=187)
-> XN Seq Scan on tbl_previous_day (cost=0.00..45337.50 rows=4533750 width=187)
The problem: width is now 190, not 4 bytes as it's supposed to be!!! Anybody knows how to make RedShift pick only the relevant columns on UNION SELECT?
Thanks!