Say I have a data.frame object in R where all the character columns have been transformed to factors. I then need to "modify" the value associated with a certain row in the data frame, but keep it encoded as a factor. I first need to extract a single row, so here is what I'm doing. Here is a reproducible example:

a = c("ab", "ba", "ca")
b = c("ab", "dd", "da")
c = c("cd", "fa", "op")
data = data.frame(a, b, c, row.names = c("row1", "row2", "row3"))
colnames(data) <- c("col1", "col2", "col3")
data[,"col1"] <- as.factor(data[,"col1"])
newdat <- data["row1",]
newdat["col1"] <- "ca"

When I assign "ca" to newdat["col1"], the factor in that column of newdat is overwritten by the string "ca". This is not the intended behavior. Instead, I want to modify the numeric value that encodes which level is present in newdat, so I want to change the contents of newdat["col1"] as follows:

Before:

Factor object, levels = c("ab", "ba", "ca"): 1 (the value it had)

After:

Factor object, levels = c("ab", "ba", "ca"): 3 (the value associated with the level "ca")

How can I accomplish this?

Quantumpencil
  • You can call `factor` again to include the new level and then assign – akrun Oct 22 '15 at 17:17
  • It already includes the level I want to change it to, but the above statement changes the field column on the 15th row so that its data_type is no longer factor, even though new_val is part of the levels set – Quantumpencil Oct 22 '15 at 17:19
  • Have you tried `dataframe[15,'field'] <- new_val` (not tested without a reproducible example) – akrun Oct 22 '15 at 17:21
  • Please add a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610) to your question. – Richard Erickson Oct 22 '15 at 17:23
  • Updating original question w/ more detail – Quantumpencil Oct 22 '15 at 17:25
  • Please use quotes `c('ab', 'bc',..` Also, some of the syntax is not correct `data.names(c(..`. Are you using Python? – akrun Oct 22 '15 at 17:38
  • A downvote seems a little harsh here. I thought the original post was clear enough and that the issue was a standard R operation I wasn't familiar with (since I'm a Python engineer who has to work with R occasionally because of a consultant), and I couldn't find it in the docs on the factor class. I've edited the post with a more detailed example – Quantumpencil Oct 22 '15 at 17:40
  • Yes, I basically have to run someone's R code inside a Python API through rpy2 – Quantumpencil Oct 22 '15 at 17:41
  • You should edit your question to clarify that you're using python and rpy2, not r – C_Z_ Oct 22 '15 at 17:43
  • edited, but rpy2 is basically just running this code in r anyway. I reproduced my original issue in R (well, rstudio) so I could check the data types that the code I was running in python has -- and they change when I attempt this, from factor to chr – Quantumpencil Oct 22 '15 at 17:55

1 Answer

What you are doing is equivalent to:

x = factor(letters[1:4]) #factor
x1 = x[1] #factor; subset of 'x'
x1 = "c" #assign new value

i.e. you assign a new object to an existing symbol. In your example, you just replace the factor in newdat["col1"] with the string "ca". Instead, to subassign into a factor (subassigning with a value that is not a level results in NA), you could use

x = factor(letters[1:4])
x1 = x[1]
x1[1] = "c"  #factor; subset of 'x' with the 3rd level
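The parenthetical about non-levels can be seen in a small sketch. The value "z" and the names x2 and x3 are only illustrative; extending the levels before assigning is one possible way to admit a new value, not something the question requires:

```r
x <- factor(letters[1:4])   # levels: "a" "b" "c" "d"
x1 <- x[1]                  # still a factor that carries all four levels

x1[1] <- "c"                # "c" is a level, so only the integer code changes
is.factor(x1)               # TRUE
as.integer(x1)              # 3

# "z" is not a level: R issues a warning and stores NA
x2 <- x[1]
x2[1] <- "z"
is.na(x2)                   # TRUE

# To store a value that is not yet a level, extend the levels first
x3 <- x[1]
levels(x3) <- c(levels(x3), "z")
x3[1] <- "z"
as.character(x3)            # "z"
```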

And in your example (I use `local` so that `newdat` itself is not modified by each attempt below):

str(newdat)
#'data.frame':   1 obs. of  3 variables:
# $ col1: Factor w/ 3 levels "ab","ba","ca": 1
# $ col2: Factor w/ 3 levels "ab","da","dd": 1
# $ col3: Factor w/ 3 levels "cd","fa","op": 1
local({ newdat["col1"] = "ca"; str(newdat) })
#'data.frame':   1 obs. of  3 variables:
# $ col1: chr "ca"
# $ col2: Factor w/ 3 levels "ab","da","dd": 1
# $ col3: Factor w/ 3 levels "cd","fa","op": 1    
local({ newdat[1, "col1"] = "ca"; str(newdat) })
#'data.frame':   1 obs. of  3 variables:
# $ col1: Factor w/ 3 levels "ab","ba","ca": 3
# $ col2: Factor w/ 3 levels "ab","da","dd": 1
# $ col3: Factor w/ 3 levels "cd","fa","op": 1
local({ newdat[["col1"]][1] = "ca"; str(newdat) })
#'data.frame':   1 obs. of  3 variables:
# $ col1: Factor w/ 3 levels "ab","ba","ca": 3
# $ col2: Factor w/ 3 levels "ab","da","dd": 1
# $ col3: Factor w/ 3 levels "cd","fa","op": 1
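On R 4.0 or later, `data.frame()` no longer converts strings to factors by default, so col2 and col3 would show as chr unless converted. The following sketch rebuilds the example with `stringsAsFactors = TRUE` (an argument the question's code does not pass, added here as an assumption so that every column is a factor) and checks that the element-wise assignment changes the level code while the column stays a factor:

```r
data <- data.frame(
  col1 = c("ab", "ba", "ca"),
  col2 = c("ab", "dd", "da"),
  col3 = c("cd", "fa", "op"),
  row.names = c("row1", "row2", "row3"),
  stringsAsFactors = TRUE      # assumption: make every character column a factor
)

newdat <- data["row1", ]       # one-row data frame; factors keep all their levels
as.integer(newdat$col1)        # 1, the code for "ab"

newdat[1, "col1"] <- "ca"      # element-wise assignment into the factor
is.factor(newdat$col1)         # TRUE
levels(newdat$col1)            # "ab" "ba" "ca"
as.integer(newdat$col1)        # 3, the code for "ca"
```

Because the result is still a factor with the same levels, code that later inspects the integer codes or the level set (for example from Python through rpy2) should see a consistent encoding.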
alexis_laz