I like to make a histogram of some data that is saved in a nested BigQuery table. In a simplified manner the table can be created in the following way:
CREATE TEMP TABLE Sessions (
id int,
hits
ARRAY<
STRUCT<
pagename STRING>>
);
INSERT INTO Sessions (id, hits)
VALUES
( 1,[STRUCT('A'),STRUCT('A'),STRUCT('A')]),
( 2,[STRUCT('A')]),
( 3,[STRUCT('A'),STRUCT('A')]),
( 4,[STRUCT('A'),STRUCT('A')]),
( 5,[]),
( 6,[STRUCT('A')]),
( 7,[]),
( 8,[STRUCT('A')]),
( 9,[STRUCT('A')]),
(10,[STRUCT('A'),STRUCT('A')]);
and it looks like
id | hits.pagename |
---|---|
1 | A |
A | |
A | |
2 | A |
3 | A |
A |
and so on for the other ids. My goal is to obtain a histogram showing the distribution of A-occurences per id in data studio. The report for the MWE can be seen here: link
So far I created a calculated field called pageviews that evaluates the wanted occurences for each session via SUM(IF(hits.pagename="A",1,0))
. Looking at a table showing id and pageviews I get the expected result:
table showing the number of occurences of page A for each id
However, the output of the calculated field is a metric, which might cause trouble for the following. In the next step I wanted to follow the procedure presented in this post. Therefore, I created another field bin to assign my sessions to bins according to the homepageviews as:
CASE
WHEN pageviews = 0 OR pageviews = 1 THEN "bin 1"
WHEN pageviews = 2 OR pageviews = 3 THEN "bin 2"
END
According to this bin-defintion I hope to obtain a histogram having 6 counts in bin 1 and 4 counts in bin 2. Well, in this particular example it will actually have 4 counts in bin one as ids 5 and 7 had "null" entries, but never mind. This won't happen in my real world table.
As you can see in the next image showing the same table as above, but now with a bin-column, this assignment works as well - each id is assigned the correct bin, but now the output field is a metric of type text. Therefore, the bar-chart won't let me use it (it needs it as dimension).
Assignment of each id to a bin
Somewhere I read the workaround to create a selfjoined blend, which outputs metrics as dimension. This works only by name: my field is now a dimension and i can use it as such for the bar-chart, but the bar chart won't load and shows a configuration error of the data source, which can be seen in this picture: