I have created a Spark DataFrame using Python (PySpark), and I want to get a mapping for its categorical variables. In other words, I would like to get the same result as running `df['name-of-variable'].cat.codes` in pandas. Is there a way to achieve this?

Dimitris Poulopoulos
-
How did you define your categorical variables? – eliasah Jun 29 '17 at 09:52
-
I have user and item IDs that both consist of alphanumeric values, so I treat them as categorical variables. – Dimitris Poulopoulos Jun 29 '17 at 10:04
-
You can probably use `StringIndexer()`, but nevertheless you should share example data and expected output at least. – mtoto Jun 29 '17 at 10:56
-
I have probably answered a question like this before using StringIndexer. – eliasah Jun 29 '17 at 11:46
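
Following up on the `StringIndexer` suggestion in the comments, here is a minimal sketch of how it might look. It is not taken from the thread: the function names (`pandas_style_codes`, `add_codes`, `demo`), the column names, and the sample IDs are all made up for illustration. It also assumes Spark 2.3 or later, where `StringIndexer` accepts `stringOrderType`; with the default `"frequencyDesc"` the codes are assigned by label frequency, which will generally not match pandas. The pure-Python helper only shows the mapping that `cat.codes` implies for plain ASCII strings (distinct values sorted, then numbered from 0), so the Spark output can be compared against it.

```python
def pandas_style_codes(values):
    """Return {value: code} as pandas' cat.codes would assign for non-missing strings.

    pandas sorts the distinct categories and numbers them starting at 0.
    """
    return {value: code for code, value in enumerate(sorted(set(values)))}


def add_codes(df, input_col, output_col):
    """Append an integer code column to a Spark DataFrame using StringIndexer."""
    # Imported lazily so this module can be loaded without PySpark installed.
    from pyspark.ml.feature import StringIndexer
    from pyspark.sql.functions import col

    indexer = StringIndexer(
        inputCol=input_col,
        outputCol=output_col,
        # Assumption: Spark 2.3+. The default "frequencyDesc" orders by count instead.
        stringOrderType="alphabetAsc",
    )
    indexed = indexer.fit(df).transform(df)
    # StringIndexer emits doubles; cast to int so the column resembles cat.codes.
    return indexed.withColumn(output_col, col(output_col).cast("int"))


def demo():
    """Hypothetical usage on made-up user IDs; needs a working local Spark install."""
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.master("local[1]").getOrCreate()
    df = spark.createDataFrame(
        [("u9a",), ("a1b",), ("u9a",), ("k3z",)], ["user_id"]
    )
    add_codes(df, "user_id", "user_code").show()
    spark.stop()
```

Calling `demo()` in an environment with PySpark should add a `user_code` column next to `user_id`. On older Spark versions without `stringOrderType`, dropping that argument still yields a consistent value-to-code mapping, just ordered by frequency rather than alphabetically.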