0

I am working with data in a tabular format where I would like each row to be one individual. Currently duplicates at the individual level occur because there are duplicates in another column for each individual.

Example Input Table:

+-------------+--------+
| EMPLOYEE_ID | COLORS |
+-------------+--------+
|      111111 | BLUE   |
|      222222 | GREEN  |
|      333333 | RED    |
|      333333 | GREEN  |
+-------------+--------+

Example Desired Output Table:

+-------------+---------+---------+
| EMPLOYEE_ID | COLOR_1 | COLOR_2 |
+-------------+---------+---------+
|      111111 | BLUE    |         |
|      222222 | GREEN   |         |
|      333333 | RED     | GREEN   |
+-------------+---------+---------+

The number of duplicates is variable (i.e. there could be a COLOR_3, COLOR_4, etc.).

Please let me know; I have tried using pivot_table but seem to be running into the issue of the data I am trying to pivot (COLORS in the original table) being categorical rather than numerical.

Thanks!

Arjun Arun
  • 313
  • 2
  • 9

0 Answers0