
I have a data frame in R, schools, which has 94 variables. From it I selected 3 variables as a set for analysis:

schools_set <- data.frame(schools$Schoolname, schools$SchoolGenderID, and School$)

The SchoolGenderID column is coded as 1s and 2s: 1 for male and 2 for female. My question is, how can I replace these 1s and 2s with "Male" and "Female" respectively in the variable schools$SchoolGenderID within the same data frame?

gung - Reinstate Monica
Imad Ahmad
  • Your code is incomplete. Can you clarify what you meant? In general, this isn't fully clear. Please add a [reproducible example](https://stackoverflow.com/q/5963269/1217536) for people to work with. – gung - Reinstate Monica Jul 23 '17 at 01:17
  • This is the full code, which reads the data from a .csv file: `schools <- data.frame(read.csv("E:AnnualSchoolData.csv", header = TRUE)); schools$SchoolGenderId[schools$SchoolGenderId == "1"] <- "Male"; schools$SchoolGenderId[schools$SchoolGenderId == "2"] <- "Female"; schools_set <- data.frame(schools$SchoolName, schools$SchoolGenderId, schools$SchoolLevelId)` – Imad Ahmad Jul 23 '17 at 09:54

3 Answers

schools$SchoolGenderID[schools$SchoolGenderID == 1] <- "Male"
schools$SchoolGenderID[schools$SchoolGenderID == 2] <- "Female"

Or

schools$SchoolGenderID <- ifelse(schools$SchoolGenderID == 1, "Male", "Female")

I recommend the latter in this particular situation.
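One thing worth checking before using ifelse() is whether the column really contains only 1s and 2s. Here is a minimal sketch on made-up data (the vector gender_id and its stray code 3 are assumptions for illustration, not the asker's data) comparing ifelse() with base R's factor(), which leaves unlisted codes as NA instead of labelling them "Female":

```r
# Toy data (hypothetical); the code 3 is deliberately invalid
gender_id <- c(1, 2, 1, 3)

# ifelse() labels everything that is not 1 as "Female", including the stray 3
by_ifelse <- ifelse(gender_id == 1, "Male", "Female")

# factor() maps only the listed levels; any other code becomes NA
by_factor <- factor(gender_id, levels = c(1, 2), labels = c("Male", "Female"))

print(by_ifelse)
print(as.character(by_factor))
```

If the data are known to be clean, the two give the same labels; factor() is just the more defensive choice.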

Odysseus210

try the data.table way :)

library(data.table)
schools = data.table(schools)

# := with a row filter does not change a column's type, so convert to character first
schools[, "SchoolGenderID" := as.character(SchoolGenderID)]
schools[SchoolGenderID == "1", "SchoolGenderID" := "Male"]
schools[SchoolGenderID == "2", "SchoolGenderID" := "Female"]
schools = schools[order(Schoolname, SchoolGenderID)]

The resultant gender column will be character type, not numeric. If you want to keep the numeric codes, then skip the conversion above and put the labels in a new column instead:

schools[SchoolGenderID == 1, "SchoolGender" := "Male"]
schools[SchoolGenderID == 2, "SchoolGender" := "Female"]
JVP
  • No data.table() function found. It is giving an error – Imad Ahmad Jul 23 '17 at 09:35
  • @ImadAhmad update R if you haven't, and add "library (data.table)" somewhere in the beginning of your code – JVP Jul 23 '17 at 13:18
  • "Error in library(data.table) : there is no package called ‘data.table’". I downloaded it, but loading it with library(data.table) gives the above message. – Imad Ahmad Jul 23 '17 at 17:57
  • @ImadAhmad you need to install the package if you don't have it. Type `install.packages ("data.table")` and hit enter, then reload the package (i.e. type `library (data.table)` and hit enter). Perhaps read some introductory materials about packages in R – JVP Jul 23 '17 at 19:02
  • I tried `install.packages("data.table")` but it is still giving the error message, even though I also tried `install.packages("data.table", depedencies = TRUE)`. – Imad Ahmad Jul 23 '17 at 19:04
  • @ImadAhmad it might help to restart R if you're having issues installing packages – JVP Jul 23 '17 at 19:08

Here's a way in dplyr

library(tidyverse)


schools_set <- schools %>%
                 select(Schoolname, SchoolGenderID) %>% # Make your subset 
                 mutate(
                   school_gender_id = ifelse(SchoolGenderID == 1, 
                                              "Male", "Female")
                 ) %>% 
                 mutate(
                   school_gender_id = as.factor(school_gender_id)
                 ) %>%
                 arrange(school_gender_id) # Order dataframe

mutate allows you to modify variables and make new ones. arrange does the work of order. If the order is incorrect, you can instead do: arrange(desc(school_gender_id)). %>% is known as a "pipe" and means "after doing this, move on to the next command."
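To make the pipe less mysterious, here is a small sketch on made-up data (the three-row schools data frame is hypothetical) showing that a piped chain and the equivalent nested call give the same result. It assumes the dplyr package is installed:

```r
library(dplyr)

# Toy data (hypothetical), standing in for the real schools data frame
schools <- data.frame(
  Schoolname = c("Alpha", "Beta", "Gamma"),
  SchoolGenderID = c(2, 1, 2)
)

# Piped form: each step receives the result of the previous one
piped <- schools %>%
  mutate(school_gender_id = factor(ifelse(SchoolGenderID == 1, "Male", "Female"))) %>%
  arrange(school_gender_id)

# Equivalent nested form, which has to be read inside-out
nested <- arrange(
  mutate(schools, school_gender_id = factor(ifelse(SchoolGenderID == 1, "Male", "Female"))),
  school_gender_id
)

# TRUE when both forms produce the same data frame
print(identical(piped, nested))
```

Because factor levels default to alphabetical order, arrange() puts the "Female" rows before the "Male" rows here.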

I recommend this tutorial: http://r4ds.had.co.nz/introduction.html It was co-written by Hadley Wickham, the lead developer of the tidyverse (not of R itself). It is a very complete and useful introduction to programming in R. Your code will be much easier to write and less ad hoc if you learn the tidyverse.

Nick