I am using one-hot encoding for categorical variables. Because the production dataset will contain a single row as input to the model, there will always be a feature mismatch error, which can be handled with the technique provided by @wen in:

How to one hot encode with pandas on a new dataset?

But how do I handle a new level of a categorical variable that appears in production data? For example, the levels were previously A, B, C and D, and now "E" is added. If I use the above technique, my input will have five features and there will again be an error.

Can you please guide me on how to handle this new-level scenario, with a code sample?
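
To make the scenario concrete, here is a minimal sketch of one possible approach, assuming the encoding is done with `pd.get_dummies` and that the dummy column names from training are kept around. The column name `cat`, the helper `encode`, and the toy data are hypothetical and only for illustration. Reindexing to the training columns drops the unseen `cat_E` column, so a row with level "E" becomes all zeros instead of adding a fifth feature:

```python
import pandas as pd

# Toy training data with the four known levels (hypothetical column name "cat").
train = pd.DataFrame({"cat": ["A", "B", "C", "D"]})

# Remember the dummy columns produced at training time.
train_cols = list(pd.get_dummies(train, columns=["cat"]).columns)


def encode(df, columns):
    """One-hot encode df and align it to the training-time columns.

    Columns missing from df are added and filled with 0; columns that were
    not seen in training (for example cat_E) are dropped by reindex.
    """
    dummies = pd.get_dummies(df, columns=["cat"])
    return dummies.reindex(columns=columns, fill_value=0)


# A single production row with a known level, and one with an unseen level.
known = encode(pd.DataFrame({"cat": ["B"]}), train_cols)
unseen = encode(pd.DataFrame({"cat": ["E"]}), train_cols)
```

With this sketch both `known` and `unseen` have the same four columns as the training data. Whether an all-zero row is an acceptable representation of an unknown level depends on the model; retraining with the new level may be the better long-term fix.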

Also, how can I pickle or wrap the above-mentioned technique in a function so that it can be reused in a web service?
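
As a rough sketch of what reuse could look like, assuming again that `pd.get_dummies` is used: the only state the technique needs is the list of categorical columns and the list of dummy columns seen in training, so that state can be pickled as a plain dictionary and loaded by the service. The names `fit_state`, `encode_row`, and `cat` are hypothetical, and `pickle.dumps`/`pickle.loads` stand in for writing to and reading from a file:

```python
import pickle

import pandas as pd


def fit_state(train_df, cat_columns):
    """Collect the state needed to encode new rows the same way as training."""
    columns = list(pd.get_dummies(train_df, columns=cat_columns).columns)
    return {"cat_columns": list(cat_columns), "columns": columns}


def encode_row(df, state):
    """Encode df and align it to the training columns stored in state."""
    dummies = pd.get_dummies(df, columns=state["cat_columns"])
    return dummies.reindex(columns=state["columns"], fill_value=0)


# Training side: build the state and serialize it.
train = pd.DataFrame({"cat": ["A", "B", "C", "D"]})
payload = pickle.dumps(fit_state(train, ["cat"]))

# Web-service side: load the state once, then encode each incoming row.
state = pickle.loads(payload)
row = encode_row(pd.DataFrame({"cat": ["C"]}), state)
```

Pickling plain lists and dictionaries rather than a custom class avoids having to make that class importable inside the web service. If scikit-learn is an option, its `OneHotEncoder` accepts `handle_unknown="ignore"` and a fitted encoder can be pickled in a similar way.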

kbsudhir
