I have a huge database and I am having many categorical variables. You can watch it here:
> M=data.frame(Type_peau,PEAU_CORPS,SENSIBILITE,IMPERFECTIONS,BRILLANCE ,GRAIN_PEAU,RIDES_VISAGE,ALLERGIES,MAINS,
+ INTERET_ALIM_NATURELLE,INTERET_ORIGINE_GEO,INTERET_VACANCES,INTERET_COMPOSITION,DataQuest1,Priorite2,
+ Priorite1,DataQuest4,Age,Nbre_gift,w,Nbre_achat)
> # pour voir s'il y a des données manquantes
> str(M)
'data.frame': 836 obs. of 21 variables:
$ Type_peau : Factor w/ 5 levels "","Grasse","Mixte",..: 3 4 5 3 4 3 3 3 2 3 ...
$ PEAU_CORPS : Factor w/ 4 levels "","Normale","Sèche",..: 2 3 3 2 2 2 3 2 3 2 ...
$ SENSIBILITE : Factor w/ 4 levels "","Aucune","Fréquente",..: 4 4 4 2 4 3 4 2 4 4 ...
$ IMPERFECTIONS : Factor w/ 4 levels "","Fréquente",..: 3 4 3 4 3 2 3 4 3 3 ...
$ BRILLANCE : Factor w/ 4 levels "","Aucune","Partout",..: 4 2 2 4 4 4 4 4 3 4 ...
$ GRAIN_PEAU : Factor w/ 4 levels "","Dilaté","Fin",..: 4 4 4 2 4 2 4 4 2 4 ...
$ RIDES_VISAGE : Factor w/ 4 levels "","Aucune","Très visibles",..: 2 2 2 4 4 2 4 2 4 2 ...
$ ALLERGIES : Factor w/ 4 levels "","Non","Oui",..: 2 2 2 2 2 2 2 2 2 2 ...
$ MAINS : Factor w/ 4 levels "","Moites","Normales",..: 3 4 4 3 3 3 3 4 4 4 ...
$ INTERET_ALIM_NATURELLE: Factor w/ 4 levels "","Beaucoup",..: 2 4 4 4 2 2 2 4 4 2 ...
$ INTERET_ORIGINE_GEO : Factor w/ 5 levels "","Beaucoup",..: 2 4 2 5 2 2 2 2 2 2 ...
$ INTERET_VACANCES : Factor w/ 6 levels "","À la mer",..: 3 4 2 2 3 2 3 2 3 2 ...
$ INTERET_COMPOSITION : Factor w/ 4 levels "","Beaucoup",..: 2 2 2 4 2 2 2 2 4 2 ...
$ DataQuest1 : Factor w/ 4 levels "-20","20-30",..: 4 3 4 4 4 3 3 2 3 2 ...
$ Priorite2 : Factor w/ 7 levels "éclatante","hydratée",..: 3 1 3 4 3 2 7 1 4 6 ...
$ Priorite1 : Factor w/ 7 levels "éclatante","hydratée",..: 4 6 1 5 1 6 1 2 6 4 ...
$ DataQuest4 : Factor w/ 2 levels "nature","urbain": 2 2 2 2 2 1 2 2 2 2 ...
$ Age : int 32 37 23 44 33 30 43 43 60 31 ...
$ Nbre_gift : int 1 4 1 1 2 1 1 1 1 1 ...
$ w : num 0.25 0.25 0.5 0.25 0.5 0 0 0 0 0.75 ...
$ Nbre_achat : int 3 4 7 3 6 9 22 13 7 16 ...
I need to convert all categorical variables to numeric automatically. For example for the variable Type_peau, it is :
head(Type_peau)
[1] Mixte Normale Sèche Mixte Normale Mixte
Levels: Grasse Mixte Normale Sèche
I want it :
head(Type_peau)
[1] 2 3 4 2 3 2
Levels: 1 2 3 4
How can I do that automatically for all categorical variables?