1

If i have 2 columns with sketches made on the same datatype, can i merge the sketches in both columns to get a final sketch corresponding to each row in the dataset in Google BigQuery?

col1 | col2 | sketchA|sketchB
_______________________________
c1  | c2    |sketch1 | sketch5
c3  |  c4   |sketch2 | null
c3  | c5    |sketch3 | null
c3  |c6     |null.   |sketch6
c2  |c6     |sketch7 |  sketch4
c5  |c9     |null    |sketch8

is it possible to merge sketch columns => sketchA, sketchB ? such that the final output is

col1 | col2 | sketchMerge
___________________
c1  | c2    |merge(sketch5, sketch1)
c3  |  c4   |sketch2
c3  | c5    |sketch3
c3  |c6     |sketch6
c2  |c6     |merge(sketch7, sketch4)
c5  |c9     |sketch8

1 Answers1

1

Yes, you can. Use below approach (BigQuery Standard SQL)

SELECT *, (
  SELECT HLL_COUNT.MERGE_PARTIAL(sketch)
  FROM UNNEST([sketchA, sketchB]) sketch
) AS sketchMerge
FROM your_table

If apply to dummy data similar to yours in question - output is

enter image description here

Mikhail Berlyant
  • 165,386
  • 8
  • 154
  • 230
  • Thanks @Mikhail This solution works. I just want to still understand how the subquery is working, ` SELECT HLL_COUNT.MERGE_PARTIAL(sketch) FROM UNNEST([sketchA, sketchB]) sketch`. my understanding is that the merge works across column. Here it is working across each row on UNNEST([sketchA, sketchB]). – Chhavi Bansal Jul 26 '23 at 11:23
  • 1
    I understood how this query is working, basically, [sketchA, sketchB] are first made into an array, with unnest these are basically exploded as separate rows. and then merge operation still works column wise to produce a single output per column. hence sketchA, sketchB being in 1 column leads to union of both those sketches. – Chhavi Bansal Aug 03 '23 at 18:51