SELECT
  subs_key,
  sum(ROUNDED_DATA_VOLUME) AS RDV_SUM, 
  CASE WHEN to_char(CALL_START_TIME , 'HH24:MI:SS') >= '00:00:00'
        AND to_char(CALL_START_TIME , 'HH24:MI:SS') <= '07:00:00' THEN 'Night'
       WHEN to_char(CALL_START_TIME , 'HH24:MI:SS') >  '07:00:00'
        AND to_char(CALL_START_TIME , 'HH24:MI:SS') <= '23:59:59' THEN 'Day' 
  END AS Tariff_flag
FROM DWH.FCT_USAGE_PREP_OGPRS_N
WHERE CALL_START_TIME >= to_date('2021-11-01', 'YYYY-MM-DD') 
  AND CALL_START_TIME <= to_date('2021-11-30', 'YYYY-MM-DD')
GROUP BY
  SUBS_KEY,
  CASE WHEN (to_char(CALL_START_TIME , 'HH24:MI:SS') >= '00:00:00'
    AND to_char(CALL_START_TIME, 'HH24:MI:SS') <= '07:00:00') THEN 'Night'
       WHEN (to_char(CALL_START_TIME , 'HH24:MI:SS') > '07:00:00'
    AND to_char(CALL_START_TIME, 'HH24:MI:SS') <= '23:59:59') THEN 'Day' 
  END

My query takes more than an hour and is still running. Is there any way to optimize it?

UPD:

[Execution plan screenshot]

Is that what Ankit asked?

    Please post the execution plan – Ankit Bajpai Feb 07 '22 at 14:28
    CALL_START_TIME <= to_date('2021-11-30','YYYY-MM-DD') will only give you through midnight the morning of 11-30. You probably want CALL_START_TIME < to_date('2021-12-01','YYYY-MM-DD'). This won't speed anything up, but will hopefully help you get the query correct – pohart Feb 07 '22 at 16:14
  • @pohart thanks. will try – condexter Feb 07 '22 at 16:23
  • What is the table partitioned by & what version of Oracle are you using? – pohart Feb 07 '22 at 16:32
  • @pohart. that I do not know, unfortunately – condexter Feb 07 '22 at 16:34
    Do you have an index on `CALL_START_TIME`? – Wernfried Domscheit Feb 07 '22 at 16:39
  • @WernfriedDomscheit. no I don't – condexter Feb 07 '22 at 16:44
  • Please learn [here](https://stackoverflow.com/a/34975420/4808122) how to create an execution plan in *text* form and post it complete, including the *Predicates*. Note that according to your plan the table has **39 rows** — is that right? (AKA *stale statistics*). I'd also recommend checking `GV$SESSION` to see if your query is really *running* or your session is *blocked*. – Marmite Bomber Feb 07 '22 at 17:14
    First of all, create the index on CALL_START_TIME, as in: `create index ix1 on DWH.FCT_USAGE_PREP_OGPRS_N (CALL_START_TIME)`. Then, run again and see if it improves. If not, there are other optimizations that can also be done, but I would start with the obvious one. – The Impaler Feb 07 '22 at 17:37
  • @TheImpaler I do not have enough privileges to do that – condexter Feb 07 '22 at 18:54
  • @condexter You don't have privileges because it's a production database, or because the permissions are wrongly set? Anyway, this is a basic setting that you should be able to test in a development sandbox without any trouble. – The Impaler Feb 07 '22 at 19:18
  • You need to collect actual execution plan with real statistics via [`dbms_sql_monitor.report_sql_monitor`](https://docs.oracle.com/en/database/oracle/oracle-database/21/arpls/DBMS_SQL_MONITOR.html) or `dbms_xplan.display_cursor` with `report_level/format` set to `'ALL ALLSTATS LAST'`. It will show you what is going on during the query execution and which operation takes most of the time. – astentx Feb 08 '22 at 07:50
  • As long as your table is partitioned and no partition pruning appears and the query returns 39 rows only and there's a PX execution (for some reason), it looks like either statistics is inaccurate or the query may be redesigned. What is the size of your table in terms of rows and bytes and what is the partitioning of the table? – astentx Feb 08 '22 at 07:53
    Partitioned by CALL_START_TIME, 4 billion rows – condexter Feb 08 '22 at 10:13
    Please, show table definition. It is unlikely to be partitioned by `CALL_START_TIME`, because for list partitioning you'll have to add partition for each second (or `CALL_START_TIME` is truncated to day, which will be very strange). – astentx Feb 09 '22 at 03:04
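For reference, the approach astentx mentions can be sketched like this (a sketch only, based on the standard Oracle interfaces; it must run in the same session as the slow query):

```sql
-- 1. Re-run the query with runtime statistics collection enabled:
SELECT /*+ gather_plan_statistics */ ... ;

-- 2. In the same session, display the actual plan with per-step
--    row counts and timings:
SELECT * FROM TABLE(dbms_xplan.display_cursor(format => 'ALLSTATS LAST'));
```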

1 Answer


You are grouping by a relatively complex function:

CASE WHEN (to_char(CALL_START_TIME , 'HH24:MI:SS') >= '00:00:00' AND to_char(CALL_START_TIME , 'HH24:MI:SS') <= '07:00:00') THEN 'Night'
WHEN (to_char(CALL_START_TIME , 'HH24:MI:SS') > '07:00:00' AND to_char(CALL_START_TIME , 'HH24:MI:SS') <= '23:59:59') THEN 'Day' END

If this particular query is important enough you could index this whole thing, but I suspect you would be better off getting the hour using extract.

CASE WHEN extract(hour from cast(CALL_START_TIME as timestamp)) < 7 THEN 'Night' --midnight (inclusive) - 7am (exclusive)
     ELSE 'Day' --7am (inclusive) - midnight (exclusive)
END

(Note the cast: in Oracle, `extract(hour from ...)` works on timestamps, not plain DATE values.)

I have no way to tell whether this will speed up your query, or speed it up sufficiently for your use case, but you can try out the two queries and see if one of them is significantly faster than the other. Also, if you frequently filter on times, it might make sense to index extract(hour from CALL_START_TIME), whereas an index on your entire CASE expression is only likely to get used in this one query.

Your question is how to speed up the query, but when I have a query that takes hours to run I will often not bother to optimize it. It runs long enough that you're unlikely to get the kind of speed-up you would need to execute it in response to a user's request, and whether it takes 1 hour or 12, you likely need to plan when you run it.
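Putting the extract-based flag together with the date-boundary fix from the comments, the whole query might look roughly like this (a sketch only, assuming CALL_START_TIME is a DATE and keeping the table and column names from the question):

```sql
SELECT
  subs_key,
  SUM(rounded_data_volume) AS rdv_sum,
  CASE WHEN EXTRACT(HOUR FROM CAST(call_start_time AS TIMESTAMP)) < 7
       THEN 'Night' ELSE 'Day'
  END AS tariff_flag
FROM dwh.fct_usage_prep_ogprs_n
WHERE call_start_time >= DATE '2021-11-01'
  AND call_start_time <  DATE '2021-12-01'  -- includes all of Nov 30
GROUP BY
  subs_key,
  CASE WHEN EXTRACT(HOUR FROM CAST(call_start_time AS TIMESTAMP)) < 7
       THEN 'Night' ELSE 'Day'
  END;
```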

  • If you're going to try adding indexes on this, you can try an index on CALL_START_TIME or (CALL_START_TIME, subs_key, extract(hour from CALL_START_TIME)). But unless this particular query is central to your business *and* you find that it brings a very significant speed-up, it likely isn't worth the cost to speed up this one query – pohart Feb 07 '22 at 16:19
  • @pohart thanks, your function reduced the execution time. Besides using your function, I used an index hint: /*+ index (og X_FCT_USAGE_PREP_OGPRS_N)*/, as suggested before, since in my situation I could not create an index directly. And in order to get all the data faster, I dropped the aggregates from the query and did the aggregation in Python (this is for those who stumble upon my question). – condexter Feb 10 '22 at 05:28
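The client-side aggregation the asker describes could be sketched in pandas like this (a hypothetical illustration: the DataFrame below stands in for the rows returned by the un-aggregated SELECT, with invented sample values; column names follow the question):

```python
import pandas as pd

# Stand-in for the raw, un-aggregated rows fetched from the database.
rows = pd.DataFrame({
    "SUBS_KEY": [1, 1, 2],
    "CALL_START_TIME": pd.to_datetime(
        ["2021-11-01 03:15:00", "2021-11-01 12:00:00", "2021-11-02 23:30:00"]),
    "ROUNDED_DATA_VOLUME": [100, 250, 50],
})

# hour < 7 -> 'Night', otherwise 'Day' (mirrors the CASE expression).
rows["TARIFF_FLAG"] = rows["CALL_START_TIME"].dt.hour.map(
    lambda h: "Night" if h < 7 else "Day")

# Per-subscriber, per-flag volume sums, like the original GROUP BY.
agg = (rows.groupby(["SUBS_KEY", "TARIFF_FLAG"], as_index=False)
           ["ROUNDED_DATA_VOLUME"].sum()
           .rename(columns={"ROUNDED_DATA_VOLUME": "RDV_SUM"}))
print(agg)
```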