1

I am analyzing user patterns in Snowflake for an ecommerce website. I would like to be able to match various patterns of a user flow (e.g. did they complete an order after viewing a specific page? did they complete an order after selecting an add to cart from a particular portion of the page? etc.).

Is it possible to calculate conversion rates for multiple patterns with a match recognize function?

The data structure looks something like this:

CREATE TEMPORARY TABLE AS events_and_visits (
    VISIT_ID bigint,
    EVENT_ID bigint,
    EVENT_NAME VARCHAR,
    REFERENCE VARCHAR
);

INSERT INTO events_and_visits VALUES 
 (1, 1, 'productView', 'reco'),
 (1, 2, 'Add To Cart', 'reco'),
 (1, 3, 'Order Complete', NULL),
 (2, 4, 'productView', 'reco'),
 (3, 5, 'productView', 'reco'),
 (3, 6, 'Add To Cart', 'merchant'),
 (4, 7, 'productView', 'reco'),
 (4, 8, 'productView', 'reco'),
 (4, 9, 'Add To Cart', 'merchant'),
 (4, 10, 'Order Complete', NULL);

My failed attempt

SELECT * 
FROM
    events_and_visits MATCH_RECOGNIZE(
        PARTITION BY visit_id
        ORDER BY
            event_id 
        MEASURES 
            match_number() AS match_number,
            classifier() AS clf 
            ALL ROWS PER MATCH WITH UNMATCHED ROWS 
        PATTERN (
                (product_rec_view + atc_merchant * | atc_rec *) * oc * --THIS IS SO F***** 
            ) 
        DEFINE
            product_rec_view AS (
                event_name = 'productView'
                AND reference = 'reco'
            ),
            atc_rec AS (
                event_name = 'Add To Cart'
                AND reference = 'reco'
            ),
            atc_merchant AS (
                event_name = 'Add To Cart'
                AND reference = 'merchant'
            ),
            oc AS event_name = 'Order Complete'
    );  
user2529589
  • 330
  • 4
  • 16
  • 1
    does this other question on multiple funnels help? https://stackoverflow.com/questions/71065504/snowflake-match-recognize-function-to-create-step-funnel – Simeon Pilgrim Mar 19 '22 at 04:52

1 Answers1

2

The data insert can be made simple (and thus only need one execution):

INSERT INTO events_and_visits VALUES 
 (1, 1, 'productView', 'reco'),
 (1, 2, 'Add To Cart', 'reco'),
 (1, 3, 'Order Complete', NULL),
 (2, 4, 'productView', 'reco'),
 (3, 5, 'productView', 'reco'),
 (3, 6, 'Add To Cart', 'merchant'),
 (4, 7, 'productView', 'reco'),
 (4, 8, 'productView', 'reco'),
 (4, 9, 'Add To Cart', 'merchant'),
 (4, 10, 'Order Complete', NULL);

both (product_rec_view+ (atc_merchant | atc_rec))? oc* or product_rec_view+ (atc_merchant | atc_rec) oc? gives me what I feel like you are wanting, but it's hard to fully understand you intent

SELECT * 
FROM
    events_and_visits MATCH_RECOGNIZE(
        PARTITION BY visit_id
        ORDER BY
            event_id 
        MEASURES
            MATCH_SEQUENCE_NUMBER() AS mseq, 
            match_number() AS match_number,
            classifier() AS clf 
            ALL ROWS PER MATCH WITH UNMATCHED ROWS 
        PATTERN (
                (product_rec_view+ (atc_merchant | atc_rec))? oc* 
                --product_rec_view+ (atc_merchant | atc_rec) oc?
            ) 
        DEFINE
            product_rec_view AS (
                event_name = 'productView' AND reference = 'reco'
            ),
            atc_rec AS (
                event_name = 'Add To Cart' AND reference = 'reco'
            ),
            atc_merchant AS (
                event_name = 'Add To Cart' AND reference = 'merchant'
            ),
            oc AS event_name = 'Order Complete'
    )
ORDER BY 1,2;  
VISIT_ID EVENT_ID EVENT_NAME REFERENCE MSEQ MATCH_NUMBER CLF
1 1 productView reco 1 1 PRODUCT_REC_VIEW
1 2 Add To Cart reco 2 1 ATC_REC
1 3 Order Complete 3 1 OC
2 4 productView reco 1 1
3 5 productView reco 1 1 PRODUCT_REC_VIEW
3 6 Add To Cart merchant 2 1 ATC_MERCHANT
4 7 productView reco 1 1 PRODUCT_REC_VIEW
4 8 productView reco 2 1 PRODUCT_REC_VIEW
4 9 Add To Cart merchant 3 1 ATC_MERCHANT
4 10 Order Complete 4 1 OC
Simeon Pilgrim
  • 22,906
  • 3
  • 32
  • 45