Replace the values of first 100 rows of a subsetted dataframe

Question

I have a dataframe like this

gender <- sample( c("M","F"), 10000, replace=TRUE, prob=c( 0.5, 0.5) )
handed <- sample( c("L","R"), 10000, replace=TRUE, prob=c( 0.2, 0.8) )
data <- data.frame(gender=gender,handed=handed)

I need to replace the first 100 rows that come from this subset:

d <- subset(data, gender=="M" & handed=="R")

I know how to extract them:

da <- head(d,n=100)

but I don't know how to replace them in data.

I want to replace them with:

gender=="F" & handed=="L"

I have tried this solution:

Conditions_Seperator <- function(condition) {
  return(unlist(strsplit(condition, "")))
}
con <- Conditions_Seperator("MR")
replaceing_con <- Conditions_Seperator("FL")

library(data.table)
setDT(data)[data[, .I[gender == as.character(con[1]) & handed == as.character(con[2])][1:size_to_decrease]],
            c('gender', 'handed') := .(as.character(replaceing_con[1]), as.character(replaceing_con[2]))][]

and the output is:

      handed
gender    L    R    M
     F 6122   95 3592
     M   96   95    0

It adds one extra column (M) to my dataset.

You can include that update by editing your post. Also include `set.seed` for a reproducible example. You should probably also make this data smaller, maybe 100 obs and replace the first 10 or something. — lmo, Mar 16 '17 at 12:12
Possible duplicate of http://stackoverflow.com/questions/12411231/r-replace-rows-in-a-data-frame-based-on-criteria — akrun, Mar 16 '17 at 12:51

Ronak Shah · Answer 1 · 2017-03-16T12:33:51.857

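We can find the positions of the first 100 TRUE values with which and head, use those indices to select rows in the original data frame, and overwrite them with a vector of replacement values built with rep. Because the assignment fills the selected block column by column, rep(c("F", "L"), each = 100) puts "F" in gender and "L" in handed.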

subs <- head(which(data$gender=="M" & data$handed=="R"), 100)
data[subs, ] <- rep(c("F", "L"), each = 100)

For reproducibility, here is a small example with 10 rows that updates only 2 of them.

set.seed(24)
gender <- sample( c("M","F"), 10, replace=TRUE, prob=c( 0.5, 0.5) )
handed <- sample( c("L","R"), 10, replace=TRUE, prob=c( 0.2, 0.8) )
data <- data.frame(gender=gender,handed=handed)
data
#   gender handed
#1       F      R
#2       F      R
#3       M      R
#4       M      R
#5       M      R
#6       M      L
#7       F      R
#8       M      R
#9       M      R
#10      F      R

subs <- head(which(data$gender=="M" & data$handed=="R"), 2)
subs
#[1] 3 4
data[subs, ] <- rep(c("F", "L"), each = 2)
data
#   gender handed
#1       F      R
#2       F      R
#3       F      L
#4       F      L
#5       M      R
#6       M      L
#7       F      R
#8       M      R
#9       M      R
#10      F      R
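
The same idea can be wrapped so that it still behaves sensibly when fewer rows match than requested. This is a minimal sketch, assuming character (not factor) columns named gender and handed; the helper name `replace_first_n` and its arguments are hypothetical and not part of the answer above.

```r
# Hypothetical helper: replace the first n rows where (gender, handed) == from
# with the values in `to`. Assumes character columns named gender and handed.
replace_first_n <- function(df, from, to, n) {
  hits <- which(df$gender == from[1] & df$handed == from[2])
  subs <- head(hits, n)      # at most n indices; fewer if there are fewer matches
  df$gender[subs] <- to[1]   # column-wise assignment, so rep() is not needed
  df$handed[subs] <- to[2]
  df
}

# The 10-row data from the example above, typed in by hand
data <- data.frame(
  gender = c("F", "F", "M", "M", "M", "M", "F", "M", "M", "F"),
  handed = c("R", "R", "R", "R", "R", "L", "R", "R", "R", "R"),
  stringsAsFactors = FALSE
)

out <- replace_first_n(data, from = c("M", "R"), to = c("F", "L"), n = 2)
which(out$gender == "F" & out$handed == "L")   # rows 3 and 4

# Asking for more rows than match changes only the matching rows
out_all <- replace_first_n(data, from = c("M", "R"), to = c("F", "L"), n = 100)
sum(out_all$gender == "M" & out_all$handed == "R")   # no right-handed males left
```

Assigning each column separately avoids having to keep the `each =` argument of rep in step with the number of rows actually selected.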

score 2 · Accepted Answer · answered Mar 16 '17 at 12:23

Here is a second base R method that uses double subscripting.

Using Ronak Shah's data, we start with the following counts for each group:

with(data, table(gender, handed))
      handed
gender L R
     F 0 4
     M 1 5

Then we use

data[data$gender=="M" & data$handed=="R",][1:2,] <- data.frame(gender="F", handed="L")

to replace the first two right-handed males with left-handed females. We end up with

with(data, table(gender, handed))
      handed
gender L R
     F 2 4
     M 1 3
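
The double-subscript assignment works because R first builds the subset, overwrites its first two rows (recycling the one-row data frame), and then writes the whole subset back into data. Below is a small check, assuming character columns, that it changes the same cells as indexing with which and head. The object names `a` and `b` are only for illustration.

```r
data <- data.frame(
  gender = c("F", "F", "M", "M", "M", "M", "F", "M", "M", "F"),
  handed = c("R", "R", "R", "R", "R", "L", "R", "R", "R", "R"),
  stringsAsFactors = FALSE
)

# Double subscripting: subset, overwrite the first 2 rows, write the subset back
a <- data
a[a$gender == "M" & a$handed == "R", ][1:2, ] <- data.frame(gender = "F", handed = "L")

# Explicit row indices into the original data frame
b <- data
subs <- head(which(b$gender == "M" & b$handed == "R"), 2)
b[subs, c("gender", "handed")] <- list("F", "L")

# Both approaches should leave the two columns in the same state
identical(a$gender, b$gender) && identical(a$handed, b$handed)
```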

akrun · Answer 3 · 2017-03-16T18:07:40.913


Here is one option using data.table

library(data.table)
setDT(data)[data[,  .I[gender=="M" &  handed == "R"][1:100]],
                        c('gender', 'handed') := .('F', 'L')][]
#       gender handed
#    1:      F      R
#    2:      F      R
#    3:      M      L
#    4:      F      L
#    5:      M      L
#   ---              
# 9996:      F      R
# 9997:      F      R
# 9998:      F      R
# 9999:      M      L
#10000:      M      R

For the updated question:

setDT(data)[data[, .I[gender==con[1] & handed == con[2]][seq_len(size_to_decrease)]],
      c('gender', 'handed') := .(replaceing_con[1], replaceing_con[2])][]

table(data)
#      handed
#gender    L    R
#     F 1068 4075
#     M  986 3871
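
Another way to pick the rows in data.table is the `which = TRUE` argument, combined with head so that no NA indices are produced when fewer rows match than requested. This is a minimal sketch on a small hand-typed table, assuming the data.table package is installed; `n` is a hypothetical stand-in for size_to_decrease, whose value is not given in the question.

```r
library(data.table)

# Small hand-typed example; the columns are character, not factor
dt <- data.table(
  gender = c("F", "F", "M", "M", "M", "M", "F", "M", "M", "F"),
  handed = c("R", "R", "R", "R", "R", "L", "R", "R", "R", "R")
)

n <- 2L   # hypothetical stand-in for size_to_decrease

# which = TRUE returns the matching row numbers instead of the rows themselves;
# head() keeps at most n of them
rows <- head(dt[gender == "M" & handed == "R", which = TRUE], n)

# Update both columns by reference, only for those rows
dt[rows, c("gender", "handed") := list("F", "L")]

dt[, .N, by = .(gender, handed)]   # counts per combination after the update
```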

edited Mar 16 '17 at 18:07

answered Mar 16 '17 at 12:31


@user5363938 I am not getting any new column – akrun Mar 16 '17 at 17:24
I know ur solution works well, but i have updated my question, can u help me with this one? @akrun – user5363938 Mar 16 '17 at 17:59
`head(data[filter, which=TRUE], 100)` might be another way of selecting those rows. – Frank Mar 16 '17 at 18:01
@user5363938 It's the same as this, like `setDT(data)[some_rows, (some_cols) := some_values]`. Actually, I did not read the question, only the answers so far. Fyi, the etiquette here is that you shouldn't substantially modify your question after answers have been posted unless all answerers told you to go ahead with it. Otherwise, best to post a new question (after attempting the new problem yourself, of course). – Frank Mar 16 '17 at 18:04
