1

I'm using the Aster Basket_Generator function to calculate a basket from a table of purchases (retail_purchases). I can create the basket without issue using the following code:

SELECT order_number, gsi_sku1, gsi_sku2, Count(1) 
FROM basket_generator(
    ON retail_purchases 
    PARTITION BY order_number 
    BASKET_SIZE(2) 
    BASKET_ITEM('gsi_sku')
    ACCUMULATE('order_number')
) 
WHERE gsi_sku1 in (11001788, 12002389) 
GROUP BY 1, 2, 3; 
LIMIT 10;

What I'd like to do additionally, is to calculate the average value of each basket. Ideally that would be returned to me as one column, but I'd be completely satisfied with the average sale price for each item in the basket.

I've tried the following:

SELECT order_number, gsi_sku1, gsi_sku2, avg(sales_amt), Count(1) 
FROM basket_generator(
    ON retail_purchases 
    PARTITION BY order_number 
    BASKET_SIZE(2) 
    BASKET_ITEM('gsi_sku')
    ACCUMULATE('order_number', 'sales_amt')
) 
WHERE gsi_sku1 in (11001788, 12002389) 
GROUP BY 1, 2, 3; 
LIMIT 10;

But the avg(sales_amt) column doesn't seem to be returning the correct value. What's the recommended way to calculate basket aggregates when using the Aster Basket_Generator analytic function?

topchef
  • 19,091
  • 9
  • 63
  • 102
TWAndrews
  • 113
  • 1
  • 4
  • @topchef, what is `sql-mr` and how is it relevant here? – Charles Oct 16 '13 at 16:44
  • Function _basket_generator_ is part of Aster Analytics Foundation, which is set of analytical functions included with Aster software. All of these functions are SQL-MR functions. SQL-MR is map-reduce framework in Aster using which one can implement and run map-reduce tasks (in Java or other languages) across nodes transparently and inside SQL code. – topchef Oct 16 '13 at 17:47

1 Answers1

0

From Aster documentation on basket_generator:

Columns in the ACCUMULATE clause should be a subset of the columns in the PARTITION BY clause. Otherwise the deterministic property is not guaranteed.

Thus, your approach doesn't hold as sales_amt is not part of PARTION BY (and rightly so since prices change).

One option is to place results of basket_generator into new table and then join it to your product catalog that contains prices - resulting table will contain prices to calculate average sale prices.

topchef
  • 19,091
  • 9
  • 63
  • 102