2

I have a dataset of 6 columns and 4.5 million rows. I found that some cells in the fifth column contain zero. I would like to write a logical check: if the value in the fifth column is zero, put 1 in the sixth column, and if not, put 0. Could you explain how to construct an algorithm to do this? I must use the data.table package. I tried name of the data[,6] = ifelse(name of the data[,5] == 0, 1, name of the data[,6]).

Matt Dowle
ribery77
  • Please show a small reproducible example and the expected result based on that. For guidelines, check [here](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – akrun Jun 19 '15 at 17:51
  • When I use read.csv I have no problem. Because I am using data.table, I am unable to use this command: df[,6] = ifelse(df[,5] == 0, 1, df[,6]) – ribery77 Jun 19 '15 at 17:59

2 Answers

6

Using data.table, we can use `:=`, which assigns by reference and so would be more efficient (example data from @plafort's post)

library(data.table)  # v1.9.4+
setDT(df)[X5==0, X6:=1] 
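
The line above only writes 1 where X5 is zero and leaves the other rows of X6 unchanged. The question also asks for 0 otherwise, so here is a minimal sketch of one way to fill both cases in a single step. The toy table `dt` and its values are made up for illustration; only the column names X5 and X6 follow the example data.

```r
library(data.table)

# Hypothetical toy data, not the asker's real table
dt <- data.table(X5 = c(1, 0, 5), X6 = c(1, 5, 3))

# X5 == 0 gives TRUE/FALSE per row; as.integer() turns that into 1/0.
# := overwrites X6 by reference, so the table is not copied.
dt[, X6 := as.integer(X5 == 0)]

print(dt)
```

Note that rows where X5 is NA would end up with NA in X6 under this approach, which may or may not be what you want.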
akrun
  • It works for me. It looks like the syntax is `setDT(df)[, ]`. And by extension, `[, , ]` – Pierre L Jun 19 '15 at 18:07
  • Hi guys it works. Thank you again for the support I really appreciate your help. I would like to thank you Akrun for help. – ribery77 Jun 20 '15 at 06:15
  • @plafort, that's a very nice way of explaining the general idea behind! – Arun Jun 20 '15 at 09:45
1

Here's a base R way:

df[,6][df[,5] == 0] <- 1

In many cases, you can avoid having to write explicit if statements. The conditional is implied in the subset. Reading it out would say, "In the sixth column of the data frame, assign the value 1 to every row where column five equals zero." Rows where column five is not zero keep their existing value. Someone more familiar with assigning column values in data.table can easily apply it to your case.

Data

set.seed(5)
df <- data.frame(replicate(6, sample(0:5, 3)))
df[2,5] <- 0
df
  X1 X2 X3 X4 X5 X6
1  1  1  3  0  1  1
2  3  0  4  1  0  5
3  4  2  5  4  5  3

df[,6][df[,5] == 0] <- 1

df
  X1 X2 X3 X4 X5 X6
1  1  1  3  0  1  1
2  3  0  4  1  0  1
3  4  2  5  4  5  3
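
If the non-zero rows should become 0 rather than keep their old value, as the question describes, a base R sketch along the following lines might work. The data frame `df2` is a made-up stand-in, not the data shown above.

```r
# Hypothetical stand-in data with the same column names as the example
df2 <- data.frame(X5 = c(1, 0, 5), X6 = c(1, 5, 3))

# ifelse() is vectorised: 1 where X5 is zero, 0 for every other row
df2$X6 <- ifelse(df2$X5 == 0, 1, 0)

print(df2)
```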
Pierre L