My input CSV file is already one-hot encoded (it's exported from another system):

id vehicle_1(car) vehicle_1(truck) vehicle_1(other)
1 0 1 0
2 1 0 0

Is there a way to tell pandas to treat the 'vehicle_' columns as one-hot encoded, perhaps during construction of the DataFrame? I'm assuming that libraries like seaborn, which can plot data based on categories, would need to know that this set of columns holds one-hot encoded values.

JoeT1492
  • Does this answer your question? [Reversing 'one-hot' encoding in Pandas](https://stackoverflow.com/questions/38334296/reversing-one-hot-encoding-in-pandas) – swiss_knight Feb 03 '21 at 21:18
  • `pd.wide_to_long` or `df.melt`... – Quang Hoang Feb 03 '21 at 21:19
  • Or `df.idxmax()`. – Quang Hoang Feb 03 '21 at 21:19
  • Thanks, I don't want to reverse the existing 1-hot encoding once imported. I simply want to "declare" to pandas that the columns are in fact already 1-hot encoded. I guess the real question is what are the implications for models if I 1-hot encode categorical features as part of pre-processing vs. importing them already 1-hot encoded. Any issues for downstream modeling etc, I need to be aware of? – JoeT1492 Feb 03 '21 at 21:47

1 Answer

I don't think there's a way to tell pandas that the imported columns are already encoded (whichever encoding was applied before importing).

The advantage is that you don't have to encode again.

The disadvantage is that the imported DataFrame treats your encoded columns as separate, unrelated columns rather than as encoded values of a single categorical column.
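
If a single categorical column is needed (for example, for category-based plotting), one possible workaround is to derive it from the one-hot columns while keeping the originals intact. The following is only a minimal sketch, assuming the file is whitespace-separated as shown in the question, that every row has exactly one indicator set, and that the columns share the `vehicle_1(` prefix; the `onehot_cols` and `labels` names are illustrative and not from the original post.

```python
import io

import pandas as pd

# Sample data mirroring the question (assumed whitespace-separated).
raw = io.StringIO(
    "id vehicle_1(car) vehicle_1(truck) vehicle_1(other)\n"
    "1 0 1 0\n"
    "2 1 0 0\n"
)
df = pd.read_csv(raw, sep=r"\s+")

# Pick out the one-hot columns by their shared prefix (an assumption).
prefix = "vehicle_1("
onehot_cols = [c for c in df.columns if c.startswith(prefix)]

# Sanity check: a valid one-hot group has exactly one 1 per row.
assert (df[onehot_cols].sum(axis=1) == 1).all()

# idxmax(axis=1) returns, for each row, the label of the column holding the 1.
labels = df[onehot_cols].idxmax(axis=1)

# Strip the prefix and the trailing ")" and store the result as a categorical.
df["vehicle_1"] = labels.str[len(prefix):-1].astype("category")

print(df[["id", "vehicle_1"]])
```

The derived `vehicle_1` column can then be passed to category-aware tools, e.g. `sns.countplot(data=df, x="vehicle_1")`, while the original indicator columns remain available for modeling. Recent pandas versions (1.5 and later) also provide `pd.from_dummies`, which may be a more direct option when the column names follow a `prefix<sep>value` pattern.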

Dharman
Nag Gooty