0

Now I'm using the query below in hive to split a row into multiple rows, but I also want to group a "Product" column based on "Category" column each group will match by the order of the group and have ";" to sperate each group and have "," separate item in the group.

SELECT id, customer, prodcut_split  
FROM orders lateral view explode(split(product,';')) products AS prodcut_split

Here is my data look like now

| id |  Customer| Category                  | Product                           |
+----+----------+---------------------------+-----------------------------------+
| 1  |  John    | Furniture; Technology     | Bookcases, Chairs; Phones, Laptop |
| 2  |  Bob     | Office supplies; Furniture| Paper, Blinders; Tables           |
| 3  |  Dylan   | Furniture                 | Tables, Chairs, Bookcases         |

my desired result will look like:

| id |  Customer| Category       | Product   |
+----+----------+----------------+-----------+
| 1  |  John    | Furniture      | Bookcases |
| 1  |  John    | Furniture      | Chairs    |
| 1  |  John    | Technology     | Phones    |
| 1  |  John    | Technology     | Laptop    |
| 2  |  Bob     | Office supplies| Paper     |
| 2  |  Bob     | Office supplies| Blinders  |
| 2  |  Bob     | Furniture      | Tables    |
| 3  |  Dylan   | Furniture      | Tables    |
| 3  |  Dylan   | Furniture      | Chairs    |
| 3  |  Dylan   | Furniture      | Bookcases |
Jobin
  • 5,610
  • 5
  • 38
  • 53
Louhzer
  • 33
  • 4

1 Answers1

0

I have tried this one and it's work well, all credit goes to this question: Hive - Split delimited columns over multiple rows, select based on position

select id,customer ,category, products
  from 
  (
   SELECT id, category, product
     FROM tale_name 
     lateral VIEW posexplode(split(category,';')) category AS pos_category, category_split
     lateral VIEW posexplode(split(product,';')) product AS pos_product, product_split
  WHERE pos_category = pos_product) a
  lateral view explode(split(product_split,',')) product_split AS products
Louhzer
  • 33
  • 4