0

I have a table in hive which is structured as below:

+------+------+--------+-------+
| col1 | col2 |  col3  | Value |
+------+------+--------+-------+
|    1 | A1   | M1     |    23 |
|    1 | A1   | M1_LW  |    25 |
|    1 | A1   | M1_L2W |    22 |
|    1 | A1   | M2     |    17 |
|    1 | A1   | M2_LW  |    21 |
|    1 | A1   | M2_L2W |    13 |
|    1 | A1   | M3     |    16 |
|    1 | A1   | M3_LW  |    30 |
|    1 | A1   | M3_L2W |    11 |
|    2 | A2   | M1     |    22 |
|    2 | A2   | M1_LW  |    22 |
|    2 | A2   | M1_L2W |    10 |
|    2 | A2   | M2     |    14 |
|    2 | A2   | M2_LW  |    25 |
|    2 | A2   | M2_L2W |    23 |
|    2 | A2   | M3     |    10 |
|    2 | A2   | M3_LW  |    20 |
|    2 | A2   | M3_L2W |    25 |
+------+------+--------+-------+

This structure suffices the need from a querying perspective, but for a specific reporting need, the table needs to be transformed to look as below:

+------+------+-------+----+----+----+
| col1 | col2 | col3  | M1 | M2 | M3 |
+------+------+-------+----+----+----+
|    1 | A1   | Today | 23 | 17 | 16 |
|    1 | A1   | LW    | 25 | 21 | 30 |
|    1 | A1   | L2W   | 22 | 13 | 11 |
|    2 | A2   | Today | 22 | 14 | 10 |
|    2 | A2   | LW    | 22 | 25 | 20 |
|    2 | A2   | L2W   | 10 | 23 | 25 |
+------+------+-------+----+----+----+

Can you please help with below using the inbuilt functions available in hive. I have tried with case for pivot and union on the result but it becomes a performance overhead doing so. Tried with the to_map UDAF but the hive version being used doesn't seem to support it. Any inputs will be greatly appreciated.

nehalkum
  • 11
  • 1
  • 4

1 Answers1

0

You can use case and conditional aggregation:

select col1, col2,
       (case when col3 like '%_LW' then 'LW'
             when col3 like '%_L2W' then 'L2W'
             else 'Today'
        end) as col3,
       max(case when col3 like 'M1%' then value end) as m1,
       max(case when col3 like 'M2%' then value end) as m2,
       max(case when col3 like 'M3%' then value end) as m3
from t
group by col1, col2,
         (case when col3 like '%_LW' then 'LW'
               when col3 like '%_L2W' then 'L2W'
               else 'Today'
          end)
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • Thanks for the answer, but this seems to break when either of M1, M2 or M3 contains any NULLs. Please suggest – nehalkum Mar 20 '18 at 12:22
  • @nehalkum . . . You should ask another question and be very clear on how to handle `NULL` values. Based on this question, it is not clear what should be done. And this question is already answered. – Gordon Linoff Mar 20 '18 at 12:23