0

Need last four months data:

select count(distinct session_id) 
from  master_gui partition for  (to_date('11-25-2020','MM-DD-YYYY')) 
where session_id in (select distinct session_id  
                    from reporting_data partition for (to_date('11-25-2020','MM-DD-YYYY')) 
where  flow_name in ('BEGIN_STATUS'));

any suggestion in above query how to include dates for last 4 months.

CHECKED FROM BELOW partition key value: SELECT OWNER, NAME, OBJECT_TYPE, COLUMN_NAME, COLUMN_POSITION FROM ALL_PART_KEY_COLUMNS

REPORTING_USER  REPORTING_DATA  TABLE   CREATE_TIME 1
REPORTING_USER  MASTER_GUI  TABLE   SESSION_START_TIME  1

using below query to get last 4 months records(Aug, Spet, Oct and Nov month)

select count(distinct session_id) 
from master_gui  where  SESSION_START_TIME >= add_months(trunc(sysdate), -4)
and  session_id in (select distinct session_id  from reporting_data where CREATE_TIME>= add_months(trunc(sysdate), -4)

and   flow_name in ('BEGIN_STATUS'));

Thanks Experts,

Used below query after changes, is it correct:

As we have to get count from master_gui table so used it and parent key value SESSION_START_TIME also reporting_data tab;e parent key value CREATE_TIME.

 select count(distinct session_id) 
from master_gui  where   SESSION_START_TIME < trunc(sysdate,'mm') 
and   SESSION_START_TIME >= add_months( trunc(sysdate, 'mm'),-4)
and  session_id in (select distinct session_id  from REPORTING_DATA where  create_time < trunc(sysdate,'mm') 
and   create_time >= add_months( trunc(sysdate, 'mm'),-4)
and   flow_name in ('BEGIN_STATUS'));

Thanks experts,

is below is correct will get some performance better by using below query, removed distinct clause from subquery inside.

 select count(distinct session_id) 
from master_gui  where   SESSION_START_TIME < trunc(sysdate,'mm') 
and   SESSION_START_TIME >= add_months( trunc(sysdate, 'mm'),-4)
and  session_id in (select session_id  from REPORTING_DATA where  create_time < trunc(sysdate,'mm') 
and   create_time >= add_months( trunc(sysdate, 'mm'),-4)
and   flow_name in ('BEGIN_STATUS'));

Thanks Experts,

I need to use in partition only to get faster perofmance:

select count(distinct session_id) 
from master_gui  partition for  (to_date('11-01-2020','MM-DD-YYYY')) 
where session_id in (select distinct session_id  from reporting_data  partition for  (to_date('11-30-2020','MM-DD-YYYY')) 
where  flow_name in ('BEGIN_STATUS'));

Is above query is correct for 1st Nov 2020 to 30th Nov 2020.

APC
  • 144,005
  • 19
  • 170
  • 281
singh_dba
  • 31
  • 4
  • Your question has changed from one of **business logic** to one about **query optimization**. You must understand, there's a limit to how much concrete advice we can give without more information. Please read this thread [on asking Oracle tuning questions](https://stackoverflow.com/a/34975420/146325). Then edit your question to provide more details. What interval are you using as the partition key? Daily? Weekly? Monthly? What proportion of hits have `flow_name` of BEGIN_STATUS ? – APC Dec 05 '20 at 08:06

2 Answers2

1

There's a DATE datatype column, I presume. If so, include it into the where clause, e.g.

... and date_column >= add_months(trunc(sysdate, 'mm'), -4)
Littlefoot
  • 131,892
  • 15
  • 35
  • 57
1

This part of your query means you are selecting records only from the partition which holds values for 25-NOV-2020.

from reporting_data partition for  (to_date('11-25-2020','MM-DD-YYYY')) 

Therefore if your table is partitioned by daily intervals you will get records only for the 25th. If the partition key is monthly you will get records only for November. Using this syntax you could only get records for the last four months if the partition key is (say) annual.

The solution is simply to omit the partition clause and use a WHERE clause instead.

select count(distinct session_id) 
from  master_gui 
where session_id in (select distinct session_id  
                     from reporting_data partition 
                     where <<partition_key_column>> >= sysdate - interval '4' month)
where  flow_name in ('BEGIN_STATUS')
and    <<partition_key_column>> >= sysdate - interval '4' month;

This query will still use partition pruning.


is it correct?

Looks like what I suggested. However, you have refined "last four months" to mean the last four complete months i.e. excluding the current month. My search criteria includes the current month. So maybe what you actually need is something like

select session_id  
from reporting_data 
where create_time < trunc(sysdate,'mm') 
and   create_time >= add_months( trunc(sysdate, 'mm'),-4)

This will provide a span from 01-AUG-2020 to 30-NOV-2020.

Incidentally, you don't need the DISTINCT in the subquery. The IN clause will handle duplicates so DISTINCT just adds unnecessary work, which could matter if you're dealing with large amounts of data.

APC
  • 144,005
  • 19
  • 170
  • 281
  • added above query with partition key value is it correct. – singh_dba Dec 04 '20 at 08:48
  • ADDED THE CHANGES IN ABOVE , kindly suggest is it correct. – singh_dba Dec 04 '20 at 10:01
  • IT'S STILL RUNNING, we need to distinct count from master_gui table and session id from reporting_data so i used parent key for both table in above code, is it syntatcially true – singh_dba Dec 04 '20 at 10:10
  • select count(distinct session_id) from master_gui where SESSION_START_TIME < trunc(sysdate,'mm') and SESSION_START_TIME >= add_months( trunc(sysdate, 'mm'),-4) and session_id in (select distinct session_id from REPORTING_DATA where create_time < trunc(sysdate,'mm') and create_time >= add_months( trunc(sysdate, 'mm'),-4) and flow_name in ('BEGIN_STATUS')); – singh_dba Dec 04 '20 at 10:10
  • suggested changes edited in query kindly confirm. – singh_dba Dec 04 '20 at 13:52
  • Have added the changes as i need to work with partition to get faster result any suggestion, kindly suggest. – singh_dba Dec 04 '20 at 17:25