0

I like to make a histogram of some data that is saved in a nested BigQuery table. In a simplified manner the table can be created in the following way:

CREATE TEMP TABLE Sessions (
    id int,
    hits 
      ARRAY<
        STRUCT<
          pagename STRING>>
);

INSERT INTO Sessions (id, hits)
VALUES 
  ( 1,[STRUCT('A'),STRUCT('A'),STRUCT('A')]),
  ( 2,[STRUCT('A')]),
  ( 3,[STRUCT('A'),STRUCT('A')]),
  ( 4,[STRUCT('A'),STRUCT('A')]),
  ( 5,[]),
  ( 6,[STRUCT('A')]),
  ( 7,[]),
  ( 8,[STRUCT('A')]),
  ( 9,[STRUCT('A')]),
  (10,[STRUCT('A'),STRUCT('A')]);

and it looks like

id hits.pagename
1 A
A
A
2 A
3 A
A

and so on for the other ids. My goal is to obtain a histogram showing the distribution of A-occurences per id in data studio. The report for the MWE can be seen here: link

So far I created a calculated field called pageviews that evaluates the wanted occurences for each session via SUM(IF(hits.pagename="A",1,0)). Looking at a table showing id and pageviews I get the expected result:

table showing the number of occurences of page A for each id

However, the output of the calculated field is a metric, which might cause trouble for the following. In the next step I wanted to follow the procedure presented in this post. Therefore, I created another field bin to assign my sessions to bins according to the homepageviews as:

CASE
    WHEN pageviews = 0 OR pageviews = 1 THEN "bin 1"
    WHEN pageviews = 2 OR pageviews = 3 THEN "bin 2"
END

According to this bin-defintion I hope to obtain a histogram having 6 counts in bin 1 and 4 counts in bin 2. Well, in this particular example it will actually have 4 counts in bin one as ids 5 and 7 had "null" entries, but never mind. This won't happen in my real world table.

As you can see in the next image showing the same table as above, but now with a bin-column, this assignment works as well - each id is assigned the correct bin, but now the output field is a metric of type text. Therefore, the bar-chart won't let me use it (it needs it as dimension).

Assignment of each id to a bin

Somewhere I read the workaround to create a selfjoined blend, which outputs metrics as dimension. This works only by name: my field is now a dimension and i can use it as such for the bar-chart, but the bar chart won't load and shows a configuration error of the data source, which can be seen in this picture:

bar-chart of id-count over bin. In the configuration of the chart one can see that "bin" is now a dimension. The chart won't plot, however, as it shows a data configuration error (sorry for the German version of data studio).

1 Answers1

0

Using this custom query as my data source in Looker Studio:

SELECT *
FROM
(
  SELECT 1 AS id, ['A','A','A'] AS hits
  UNION ALL
  SELECT 2 AS id, ['A'] AS hits
  UNION ALL
  SELECT 3 AS id, ['A','A'] AS hits
  UNION ALL
  SELECT 4 AS id, ['A','A'] AS hits
  UNION ALL
  SELECT 5 AS id, [] AS hits
  UNION ALL
  SELECT 6 AS id, ['A'] AS hits
  UNION ALL
  SELECT 7 AS id, [] AS hits
  UNION ALL
  SELECT 8 AS id, ['A'] AS hits
  UNION ALL
  SELECT 9 AS id, ['A'] AS hits
  UNION ALL
  SELECT 10 AS id, ['A','A'] AS hits
)

Create a simple data blend from that data source, with id as the Dimension and a pageviews metric, using the same formula you used. (I think at the time you posted this question, simple blends like this were not possible. I believe they used to require a minimum of 2 sources which is probably why you mentioned a self-join)

Screenshot of the Data Blend (notice that no join is required)

The formula for the pageviews metric: SUM(IF(hits='A',1,0))

The output of the data blend looks like this:

| id | pageviews |
| -- | --------- |
| 1  | 3         |
| 2  | 1         |
| 3  | 2         |
| 4  | 2         |
| 5  | 0         |
| 6  | 1         |
| 7  | 0         |
| 8  | 1         |
| 9  | 1         |
| 10 | 2         |

Now, you can add a bar chart to your Report using this blend as the data source. The dimension should be a calculated field with the following formula (or some variation of it):

CASE 
  WHEN pageviews <= 1 THEN 'Bin 1'
  WHEN pageviews <= 3 THEN 'Bin 2'
END

For the Metric, select id and use the Count Distinct aggregation.

The final bar chart looks like this:

Bar Chart (with SETUP pane visible)