With 40+ columns, it may be a lot of work to implement the logic using a CASE WHEN eg.
CASE
WHEN typeA > typeB AND typeA > typeC THEN 'typeA'
...etc
END
Another approach may be to let the database do the work with ranking and agreggrate functions as shown below. I've unpivoted the data with the first CTE all_data
and added a meta column. You may also explore this other amazon athena approach if desired. I then continued to rank the data to find which type was the largest. I then determined whether there were multiple types ranked the same similar to row 3
where typeA
and typeB
had the same value. A case statement then decided whether I would use the actual type or can't decide
. With the classification for each row id, I then used an inner join to add this new column to the final projection. See sample below:
WITH all_data AS (
SELECT id, typeA as val, 'typeA' as type from my_table UNION ALL
SELECT id, typeB as val, 'typeB' as type from my_table UNION ALL
SELECT id, typeC as val, 'typeC' as type from my_table
),
ranked_data AS (
SELECT
id,
val,
type,
rank() over (partition by id order by val desc) as rnk
FROM
all_data
),
classification AS (
SELECT
id,
CASE
WHEN COUNT(id) > 1 THEN 'undecided'
ELSE MAX(type)
END as class_type
FROM
ranked_data
WHERE rnk=1
GROUP BY id
)
SELECT
m.*,
c.class_type as classification
FROM
my_table m
INNER JOIN
classification c ON c.id = m.id