I am trying to condense a table which contains multiple rows per event to a smaller table which contains counts of key sub-events within each event. Events are defined based on unique combinations across columns.
As a specific example, say I have the following data involving customer visits to various stores on different dates with different items purchased:
cust date store item_type
a 1 Main St 1
a 1 Main St 2
a 1 Main St 2
a 1 Main St 2
b 1 Main St 1
b 1 Main St 2
b 1 Main St 2
c 1 Main St 1
d 2 Elm St 1
d 2 Elm St 3
e 2 Main St 1
e 2 Main St 1
a 3 Main St 1
a 3 Main St 2
I would like to restructure the data to a table that contains a single line per customer visit on a given day, with appropriate counts. I am trying to understand how to use SQLite to condense this to:
Index cust date store n_items item1 item2 item3 item4
1 a 1 Main St 4 1 3 0 0
2 b 1 Main St 3 1 2 0 0
3 c 1 Main St 1 1 0 0 0
4 d 2 Elm St 2 1 0 1 0
5 e 2 Main St 2 2 0 0 0
6 a 3 Main St 2 1 1 0 0
I can do this in excel for this trivial example (begin with sumproduct( cutomer * date) as suggested here, followed by cumulative sum on this column to generate Index, then countif and countifs to generate desired counts).
Excel is poorly suited to doing this for thousands of rows, so I am looking for a solution using SQLite.
Sadly, my SQLite kung-fu is weak.
I think this is the closest I have found, but I am having trouble understanding exactly how to adapt it.
When I tried a more basic approach to begin by generating a unique index:
CREATE UNIQUE INDEX ui ON t(cust, date);
I get:
Error: indexed columns are not unique
I would greatly appreciate any help with where to start. Many thanks in advance!