Transform variables using dplyr in R

Question

I have the Titanic dataset, and I want to make its variables suitable for SVM analysis.

> str(train)
'data.frame':   891 obs. of  12 variables:
 $ PassengerId: int  1 2 3 4 5 6 7 8 9 10 ...
 $ Survived   : int  0 1 1 1 0 0 0 0 1 1 ...
 $ Pclass     : int  3 1 3 1 3 3 1 3 3 2 ...
 $ Name       : chr  "Braund, Mr. Owen Harris" "Cumings, Mrs. John Bradley (Florence Briggs Thayer)" "Heikkinen, Miss. Laina" "Futrelle, Mrs. Jacques Heath (Lily May Peel)" ...
 $ Sex        : chr  "male" "female" "female" "female" ...
 $ Age        : num  22 38 26 35 35 NA 54 2 27 14 ...
 $ SibSp      : int  1 1 0 1 0 0 0 3 0 1 ...
 $ Parch      : int  0 0 0 0 0 0 0 1 2 0 ...
 $ Ticket     : chr  "A/5 21171" "PC 17599" "STON/O2. 3101282" "113803" ...
 $ Fare       : num  7.25 71.28 7.92 53.1 8.05 ...
 $ Cabin      : chr  "" "C85" "" "C123" ...
 $ Embarked   : chr  "S" "C" "S" "S" ...

I want to remove some of the variables and convert chr variables such as Sex and Embarked to factors.

This is what I have so far.

train <- train %>%
  dplyr::select(-1,-4,-9,-11) %>%
  mutate(Sex=recode(Sex, "male"=1, "female"=0)) %>%
  mutate(Embarked=recode(Embarked, "C"=1, "S"=0)) %>%
  na.omit()

Your question is too vague. Please tell us exactly which rows and columns you want to keep, and what format they need to be in. What error are you encountering? Please read this post on [how to ask good questions and compose a minimal working example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). — Vincent, Jan 19 '21 at 18:55

score 0 · Answer 1 · edited Jan 19 '21 at 19:23

Do you mean something like this: converting the character columns to factors and then recoding them?

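library(titanic) # provides the titanic_train dataset
library(dplyr)   # provides %>%, select, mutate, mutate_if, recode

View(titanic_train) # optional: inspect the data interactively

train <- titanic_train %>%
  mutate_if(is.character, as.factor) %>% # convert all character columns to factors
  dplyr::select(-1, -4, -9, -11) %>% # drop PassengerId, Name, Ticket, Cabin
  mutate(Sex = recode(Sex, "male" = "1", "female" = "0")) %>% # recode factor levels
  mutate(Embarked = recode(Embarked, "C" = "1", "S" = "0")) %>% # recode factor levels; caution: Embarked has 4 levels ("", "C", "Q", "S"), so "" and "Q" stay as they are
  na.omit() # drop rows with missing values (e.g. in Age)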
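
If the goal is simply factors rather than 0/1 codes, a variant along the following lines may be easier to maintain. This is only a sketch under some assumptions: `toy` is a made-up stand-in for the training data, columns are dropped by name instead of position, and treating the empty string in Embarked as missing is my guess at what you want, not something stated in the question.

```r
library(dplyr)

# Hypothetical toy data standing in for the Titanic training set
toy <- data.frame(
  PassengerId = 1:4,
  Survived    = c(0L, 1L, 1L, 0L),
  Sex         = c("male", "female", "female", "male"),
  Age         = c(22, 38, NA, 35),
  Embarked    = c("S", "C", "Q", ""),
  stringsAsFactors = FALSE
)

prepared <- toy %>%
  select(-PassengerId) %>%                         # drop columns by name, not position
  mutate(Embarked = na_if(Embarked, "")) %>%       # assumption: empty string means missing
  mutate(across(c(Sex, Embarked), as.factor)) %>%  # convert only the chosen columns
  na.omit()                                        # drop rows with any missing value

str(prepared)
```

Selecting by name keeps the pipeline working if the column order changes, and `across()` limits the conversion to the columns you list instead of every character column.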
