5

Input table

Patients  Hospital   Drug   Response
1         AAA        a      Good
1         AAA        a      Bad
2         BBB        a      Bad
3         CCC        b      Good
4         CCC        c      Bad
5         DDD        e      undefined 

Output file

Patients  Hospital   Drug   Response
1         AAA        a      1
1         AAA        a      -1
2         BBB        a      -1
3         CCC        b      1
4         CCC        c      -1
5         DDD        e       

How do I replace three text values in one column with numbers and a blank?

In the Response column, "Good" should become "1", "Bad" should become "-1", and "undefined" should become a blank (" ").

Data:

structure(list(Patients = c(1L, 1L, 2L, 3L, 4L, 5L), Hospital = structure(c(1L, 
1L, 2L, 3L, 3L, 4L), .Label = c("AAA", "BBB", "CCC", "DDD"), class = "factor"), 
    Drug = structure(c(1L, 1L, 1L, 2L, 3L, 4L), .Label = c("a", 
    "b", "c", "e"), class = "factor"), Response = structure(c(2L, 
    1L, 1L, 2L, 1L, 3L), .Label = c("Bad", "Good", "undefined"
    ), class = "factor")), .Names = c("Patients", "Hospital", 
"Drug", "Response"), class = "data.frame", row.names = c(NA, 
-6L))
Gavin Simpson
Catherine

4 Answers

17

You can do this with one line by changing the labels of the factor Response:

> within(df, Response <- factor(Response, labels = c(-1, 1, "")))
  Patients Hospital Drug Response
1        1      AAA    a        1
2        1      AAA    a       -1
3        2      BBB    a       -1
4        3      CCC    b        1
5        4      CCC    c       -1
6        5      DDD    e         
Gavin Simpson
  • +1 nice! Of course, `Response` has to be a factor (which it probably is). – csgillespie Apr 10 '11 at 16:25
  • Alternatively, `sapply(Data$Response,switch,'Good'=1,'Bad'=-1,'undefined'="")` is a more general way of doing that, but it is definitely slower. Plus, when using switch, Data$Response has to be a character vector, otherwise you get faulty results. – Joris Meys Apr 10 '11 at 21:09
  • How do you do this without overwriting the original data frame? – stackoverflowuser2010 Feb 27 '14 at 00:03
  • @stackoverflowuser2010 (So *now* you want my R advice? :-) If you want to create the vector named `Response` in the output from `within()`, simply `factor(df$Response, labels = c(-1, 1, ""))` will do it. If you want a new data frame, then just assign the result of `within()` to a new object: `df2 <- within(df, Response <- factor(Response, labels = c(-1, 1, "")))`. `within()` (and the similar `transform()`) are convenience functions for interactive use; they just return a modified data frame, so assigning the result to something other than the data frame used will result in a new data frame. – Gavin Simpson Feb 27 '14 at 02:01
  • @GavinSimpson I'm a little confused by your answer about creating a vector named Response in the output. I want a new column (vector) in my data frame that is now coded. `within(df, obs <- factor(df$Response, labels = c(1,0,-1)))` does NOT add a new permanent column; it produces a temporary list. So you still have to write a new object or write over the old object. Or am I missing something? – Kerry Apr 01 '14 at 18:23
  • @Kerry `within` returns a modified data frame, you need to assign it, either to a new object or to the existing object name. This is R - most things don't get modified without assigning them. Also note that you don't need `factor(df$Response, ....)` but just `factor(Response, ....)` because `Response` is made visible by `within`. – Gavin Simpson Apr 01 '14 at 23:06
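The points raised in this answer and its comments can be sketched together as follows. This is only an illustration: the data frame is rebuilt from the values in the question, `df2` and `recoded` are names chosen here, and mapping "undefined" to NA in the `switch()` variant is an assumption made so that the result stays numeric.

```r
# Rebuild the question's data (same values as the dput() output in the question).
df <- data.frame(
  Patients = c(1L, 1L, 2L, 3L, 4L, 5L),
  Hospital = factor(c("AAA", "AAA", "BBB", "CCC", "CCC", "DDD")),
  Drug     = factor(c("a", "a", "a", "b", "c", "e")),
  Response = factor(c("Good", "Bad", "Bad", "Good", "Bad", "undefined"))
)

# Spell out levels and labels together so the mapping does not depend on
# the alphabetical order in which factor() sorts the levels.
df2 <- within(df, Response <- factor(Response,
                                     levels = c("Good", "Bad", "undefined"),
                                     labels = c("1", "-1", "")))

# within() returns a modified copy; the original data frame is unchanged.
levels(df$Response)          # original levels are still there
as.character(df2$Response)   # recoded values, in row order

# The switch() alternative from the comments needs a character vector.
# "undefined" maps to NA here (an assumption) so the result stays numeric.
recoded <- sapply(as.character(df$Response), switch,
                  Good = 1, Bad = -1, undefined = NA_real_,
                  USE.NAMES = FALSE)
```

Giving `levels` explicitly is a design choice rather than a requirement: the one-liner in the answer relies on the default sorted level order (Bad, Good, undefined), which is easy to get wrong when the labels change.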
5

Catherine, your questions could still be answered by a very basic textbook in R. Please see Dirk's comment in your previous question.

Answer

If d is your data frame, then:

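# Response is a factor in the posted data; convert it first so new values are accepted
d$Response = as.character(d$Response)
d[d$Response == "Good",]$Response = 1
d[d$Response == "Bad",]$Response = -1
d[d$Response == "undefined",]$Response = ""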

I'm guessing (I may be wrong) that "undefined" marks missing data. In that case, use NA rather than a blank. Any basic R book will describe NAs.
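If "undefined" really does mean missing, one possible way to follow the NA suggestion is a named lookup vector, which also keeps the recoded column numeric. This is a sketch under that assumption; `codes` and the new column `Score` are hypothetical names, not part of the question.

```r
# A cut-down copy of the question's data, with Response stored as a factor.
d <- data.frame(
  Patients = c(1L, 1L, 2L, 3L, 4L, 5L),
  Response = c("Good", "Bad", "Bad", "Good", "Bad", "undefined"),
  stringsAsFactors = TRUE
)

# Named lookup vector: names are the old labels, values are the new codes.
codes <- c(Good = 1, Bad = -1, undefined = NA)

# Index by the character form of the factor, not by its integer codes.
d$Score <- unname(codes[as.character(d$Response)])

is.na(d$Score)                # flags the row that was "undefined"
mean(d$Score, na.rm = TRUE)   # numeric summaries work once NA is skipped
```

Because the result is numeric, functions such as `mean()` can be applied directly, which would not be the case with a blank string in the column.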

Community
csgillespie
3

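If your data is in a data frame df:

# Response is a factor in the posted data; convert it to character first
df$Response <- as.character(df$Response)
df$Response[df$Response == "Good"] <- 1
df$Response[df$Response == "Bad"] <- -1
df$Response[df$Response == "undefined"] <- ""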
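Whether this kind of indexed assignment works depends on how Response is stored. The sketch below, which uses a one-column stand-in for the question's data and a hypothetical copy named `broken`, shows what happens with a factor column and how a character column behaves instead.

```r
# Response stored as a factor, as in the dput() output in the question.
df <- data.frame(
  Response = factor(c("Good", "Bad", "Bad", "Good", "Bad", "undefined"))
)

# Assigning a value that is not an existing level produces NA.
# suppressWarnings() hides the "invalid factor level" warning R would print.
broken <- df
suppressWarnings(broken$Response[broken$Response == "Good"] <- 1)
anyNA(broken$Response)

# Converting to character first lets each replacement go through.
df$Response <- as.character(df$Response)
df$Response[df$Response == "Good"] <- 1
df$Response[df$Response == "Bad"] <- -1
df$Response[df$Response == "undefined"] <- ""
df$Response
```

Note that the column ends up as character, so the numbers are stored as the strings "1" and "-1".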
Noah
2

You can use nested ifelse() calls:

cath <- data.frame(nmbrs = runif(10), words = sample(c("good", "bad"), 10, replace = TRUE))
cath$words <- ifelse(cath$words == "good", 1, ifelse(cath$words == "bad", -1, ""))
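Applied to the values from the question (which are capitalised, unlike the sample data above), the same idea might look like the sketch below. The names `as_text` and `as_number` are made up for illustration, and the NA fallback is an assumption about what "undefined" means.

```r
# The Response values from the question, in row order.
df <- data.frame(
  Response = c("Good", "Bad", "Bad", "Good", "Bad", "undefined")
)

# Mixing numbers with "" makes ifelse() return a character vector.
as_text <- ifelse(df$Response == "Good", 1,
                  ifelse(df$Response == "Bad", -1, ""))
class(as_text)

# Using NA for the fallback keeps the result numeric.
as_number <- ifelse(df$Response == "Good", 1,
                    ifelse(df$Response == "Bad", -1, NA))
class(as_number)
```

The comparison with `==` works whether Response is a character vector or a factor, so no conversion is needed before calling `ifelse()`.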
Roman Luštrik