-1

I have been stuck to get around this in Oracle SQL. I have a table that looks like following.

cust_id event event_date camp_id email_ind(calculated column)
1 OPENED 06/09/2021 80
1 SENT 06/09/2021 80 initial
1 SENT 14/09/2021 80 reminder
2 CLICK 15/04/2021 80
2 OPENED 15/04/2021 80
2 SENT 15/04/2021 80 initial
2 CONVERTED 15/04/2021 80
2 WEBSITE_VISIT 15/04/2021 80
3 OPENED 01/06/2021 80
3 SENT 01/06/2021 80 initial
3 OPENED 09/06/2021 80
3 SENT 09/06/2021 80 reminder
3 CLICK 10/06/2021 80
3 OPENED 10/06/2021 80
3 SENT 10/06/2021 80
3 CONVERTED 10/06/2021 80
3 WEBSITE_VISIT 10/06/2021 80
4 SENT 06/09/2021 80 initial
4 SENT 14/09/2021 80 reminder
5 OPENED 20/09/2021 80
5 SENT 20/09/2021 80 initial
5 SENT 28/09/2021 80 reminder
5 CLICK 03/10/2021 80
5 OPENED 03/10/2021 80
5 SENT 03/10/2021 80 not a reinder email
5 CONVERTED 03/10/2021 80
5 WEBSITE_VISIT 03/10/2021 80
5 OPENED 05/11/2021 80
6 OPENED 01-Jun-21 80
6 SENT 01-Jun-21 80 initial
6 OPENED 09-Jun-21 80
6 SENT 09-Jun-21 80 reminder
7 OPENED 26-Jul-21 80
7 SENT 26-Jul-21 80 initial
7 CLICK 03-Aug-21 80
7 OPENED 03-Aug-21 80
7 SENT 03-Aug-21 80 not a reminder email
7 CONVERTED 03-Aug-21 80
7 WEBSITE_VISIT 03-Aug-21 80
7 CLICK 04-Aug-21 80
7 OPENED 04-Aug-21 80
7 SENT 04-Aug-21 80
8 OPENED 12-Jul-21 80
8 SENT 12-Jul-21 80 initial
8 OPENED 20-Jul-21 80
8 SENT 20-Jul-21 80 reminder
9 SENT 29-Apr-21 80 initial
9 SENT 07-May-21 80 reminder

cust_id is customer_id ,event is the type of event (SENT - email was sent, OPENED - email was opened, CLICK- email was clicked, CONVERTED - customer accepted the offer) ,event_date is the timestamp of the event ,camp_id is the campaign_id.

Please note email_ind is not in the dataset. I basically want to know if the customer CONVERTED on the initial email or the reminder email that was SENT or there was no CONVERSION at all?

I want to create a new column called email_ind that tells if it is a initial email or a reminder email. Suppose, for cust_id=1, the initial email was sent on 06-Sept-21 and reminder email was sent 14-Sept-21 For cust_id=2, there was only initial email on 15-Apr-21 and then the cust_id CONVERTED on the same day. That means the customer accepted the offer on the initial email.

Also, when there is event='CONVERTED' there is automatically an event='SENT' is generated which should not be considered as a reminder email. For ex for cusT_id='3', the event 'CONVERTED' occured on 10-JUN-21 and automatically an event='SENT' is generated for 10-JUN-21 which should not be considered as reminder email.

Also, once the customer is CONVERTED, I do not want to consider the events that follow after conversion. For ex cust_id=7, it got converted on 03-Aug-21, so the rest of the vents on 04-Aug-21 can be ignored There are multiple camp_ids.

I tried self join and use windows function like over() partition by() but it doesn't seem to work. I must be missing something. It would be great if anyone could help here?

MT0
  • 143,790
  • 11
  • 59
  • 117
  • `I basically want to know if the customer CONVERTED on the initial email or the reminder email that was SENT or there was no CONVERSION at all` and `I want to create a new column called email_ind that tells if it is a initial email or a reminder email` sound like two separate questions; which one do you need help with? Also, does this need to have in consideration campaing_id i.e. if there's campaing_id = 81 the events reset? – Josh Part May 13 '22 at 18:36
  • The overall question is whether the customer converted on the initial email or the reminder email? Yes, the campaign id should be taken into consideration. The other campaigns would have similar events although the sequence of events can change. – CuriousQuest May 13 '22 at 20:26

1 Answers1

0

From Oracle 12, you can use MATCH_RECOGNIZE to do row-by-row processing:

SELECT cust_id,
       event,
       event_date,
       camp_id,
       CASE email_ind
       WHEN 'CONVERTED' THEN 'CONVERTED'
       WHEN 'INITIAL'   THEN 'INITIAL'
       WHEN 'REMINDER'  THEN 'REMINDER'
       END AS email_ind,
       CASE
       WHEN email_ind = 'CONVERTED'
       AND  has_reminder = 1
       THEN 'REMINDER'
       WHEN email_ind = 'CONVERTED'
       THEN 'INITIAL'
       END AS conversion_type
FROM   (SELECT t.*, ROWNUM AS rn FROM table_name t)
MATCH_RECOGNIZE (
  PARTITION BY cust_id
  ORDER     BY rn
  MEASURES
    CLASSIFIER() AS email_ind,
    COUNT(reminder.cust_id)  AS has_reminder
  ALL ROWS PER MATCH
  PATTERN ( ("INITIAL" not_sent*)? (reminder not_sent*)? not_reminder converted | other )
  DEFINE
    "INITIAL"    AS event = 'SENT',
    reminder     AS event = 'SENT',
    not_reminder AS event = 'SENT',
    not_sent     AS event <> 'SENT',
    converted    AS event = 'CONVERTED'
)

db<>fiddle here

MT0
  • 143,790
  • 11
  • 59
  • 117
  • Thank you for the comment. But the email_ind is not in the dataset. I just put in the table to give an example of initial and reminder. – CuriousQuest May 13 '22 at 20:29
  • @CuriousQuest Have you tried running the query or looking at the linked fiddle? It generates the `email_ind` field from the `MATCH_RECOGNIZE` and the pattern's `CLASSIFIER()`. – MT0 May 13 '22 at 20:31
  • I am getting this error- Missing IN or OUT parameter at index:: 1 when trying to run this query. My Oracle version is - Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production. – CuriousQuest May 14 '22 at 17:08
  • @CuriousQuest It works in the db<>fiddle linked in the answer in both Oracle 21 and Oracle 18. If you get an error then please provide a [MRE] with a way to replicate that error because the fiddles show that the query does not generate that error so you probably have something else that is generating the error. – MT0 May 14 '22 at 17:13
  • Thank you. It worked by removing the ? from the PATTERN. But I guess the query isn't working fully correct. In the fiddle output , the cust_id-2 should be CONVERTED on the INITIAL email but in the output it is null. – CuriousQuest May 14 '22 at 17:51
  • @CuriousQuest Don't remove the `?` make sure you escape it so it is not treated as an anonymous bind variable (or don't use a JDBC `PreparedStatement` and just use `Statement`). See [here](https://stackoverflow.com/a/71494842/1509264) – MT0 May 14 '22 at 17:56
  • Yes, by removing ?, it is not giving correct output. I tried searching on how to escape ? in oracle SQL developer but didn't find correct answer. It is taking ? as bind variable and trying to match the parameters that's passed. I am not using Java so I cant use a JDBC PreparedStatement or Statement. – CuriousQuest May 14 '22 at 19:11