I am querying a table containing billions of rows (BIG_TABLE
) in Oracle. The table is partitionned on a monthly basis (DTE
is the month of flux). The DBA wants us to filter our queries by month of flux. The DBA warned that (i) not doing so may cause slower queries and (ii) it may impact the others users so he may kill the non-filtered queries.
Since I need several months (sometimes up to 100), I use a parametrized query to pass the months I need (more below). I collect the results for all months in an intermediate table (INTERMEDIATE_TABLE
, containing about 200 million rows per month) and then I aggregate the data (FINAL_TABLE
, used for the analyses). The sum must be done by CHR
whatever the month of flux.
-- query1
CREATE TABLE
INTERMEDIATE_TABLE (
CHR VARCHAR2(255),
NBR NUMBER,
DTE DATE
);
-- query2
INSERT INTO
INTERMEDIATE_TABLE
SELECT
CHR,
NBR,
DTE
FROM
BIG_TABLE
WHERE
DTE = TO_DATE(?, 'YYYY-MM-DD');
-- query3
CREATE TABLE
FINAL_TABLE AS
SELECT
CHR,
SUM(NBR) AS NBR
FROM
INTERMEDIATE_TABLE
GROUP BY
CHR;
After saving the queries as strings (query1
, query2
, query3
), I use R's DBI to run the queries. dbExecute()
passes the dates one by one to the equal comparison in query2
, so the query is executed 3 times (i.e. the list is not passed in one go).
library(DBI)
dbConnect(odbc::odbc(), ...)
dbExecute(con, query1)
dbExecute(con, query2, params = list(c("2020-01-01", "2020-02-01", "2020-03-01")))
dbExecute(con, query3)
I would like INTERMEDIATE_TABLE
to be temporary. Can I INSERT INTO
a CTE?
I am open to alternative solutions but I am looking for an ANSI solution because a move to PostgreSQL is planned. I would like to avoid temporary table because only the data is temporary in Oracle, not the table. Of course I could just DROP
the table afterwards, but it seems more elegant/explicit (and efficient?) to make it temporary in the first place.
A reproducible example:
CREATE TABLE
BIG_TABLE (
CHR VARCHAR2(255),
NBR NUMBER,
DTE DATE
);
INSERT ALL
INTO BIG_TABLE VALUES ('A', 2, DATE '2020-01-01')
INTO BIG_TABLE VALUES ('B', 3, DATE '2020-01-01')
INTO BIG_TABLE VALUES ('A', 1, DATE '2020-02-01')
INTO BIG_TABLE VALUES ('B', 2, DATE '2020-02-01')
INTO BIG_TABLE VALUES ('A', 3, DATE '2020-02-01')
INTO BIG_TABLE VALUES ('B', 2, DATE '2020-03-01')
INTO BIG_TABLE VALUES ('B', 4, DATE '2020-03-01')
INTO BIG_TABLE VALUES ('C', 1, DATE '2020-03-01')
INTO BIG_TABLE VALUES ('B', 4, DATE '2020-04-01')
INTO BIG_TABLE VALUES ('D', 2, DATE '2020-05-01')
SELECT 1 FROM DUAL;
Desired output:
CHR NBR
A 6
B 11
C 1