R code to create new variables based on certain conditions

Question

I have column A with values. I want to divide the values from column A by 2 or 3 IF they meet the following conditions:

Condition 1: If Column B and Column C = 250, then divide Column A by 2; Condition 2: If Column C and Column D = 250, then divide Column A by 2; Condition 3: If Column B and Column C and Column D = 250, then divide Column A by 3

Condition 4: If Column B and Column C = 500, then divide Column A by 2; Condition 5: If Column C and Column D = 500, then divide Column A by 2; Condition 6: If Column B and Column C and Column D = 500, then divide Column A by 3

and so on....

In other words, if two of the columns B, C and D share the same value, divide by 2; if all three share the same value, divide by 3.

As an example the data is:

A           B     C      D
0.666667    250   500    250
0.666667    500   500    1000
0.666667    250   1000   1000
0.666667    500   500    1000
0.666667    250   500    500
0.666667    250   500    500

As for the counts, here is what I would get after the first part of the condition. For example, row 1 contains two 250s and one 500, hence 2, 1, 2 in the new columns B1, C1, D1:

A           B     C      D      B1   C1   D1
0.666667    250   500    250    2    1    2
0.666667    500   500    1000   2    2    1
0.666667    250   1000   1000   1    2    2
0.666667    500   500    1000   2    2    1
0.666667    250   500    500    1    2    2
0.666667    250   500    500    1    2    2

I now need to divide column A by B1, C1 and D1 to give three new columns BR, CR and DR (the last row, 500 500 500, is an extra case added to show a three-way match):

A           B     C      D      BR      CR      DR
0.666667    250   500    250    0.333   0.667   0.333
0.666667    500   500    1000   0.333   0.333   0.667
0.666667    250   1000   1000   0.667   0.333   0.333
0.666667    500   500    1000   0.333   0.333   0.667
0.666667    250   500    500    0.667   0.333   0.333
0.666667    250   500    500    0.667   0.333   0.333
0.666667    500   500    500    0.222   0.222   0.222

I am still trying to work out the code.

data %>% mutate(A1 == ifelse(B == 250 & C == 250, A/2, ifelse(B == 250 & D == 250, A/2, ifelse(B == 250 & C == 250 & D == 250, A/3))
data %>% mutate(A1 == ifelse(B == 500 & C == 500 , A/2, ifelse(B == 500 & D == 500 , A/2, ifelse(B == 500 & C == 500 & D == 500, A/3))
data %>% mutate(A1 == ifelse(B == 1000 & C == 1000 , A/2, ifelse(B == 1000 & D == 1000 , A/2, ifelse(B == 1000 & C == 1000 & D == 1000, A/3))

I get a + asking for more code.

Any help would be most appreciated. Thank you!
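
A note on the `+` prompt: R prints `+` when the expression typed so far is incomplete, and each `mutate()` line above opens four parentheses but closes only two. The lines also use `==` (comparison) where `=` (naming the new column) is presumably intended, the innermost `ifelse()` has no third argument, and the three-way test comes after a two-way test that already catches it, so it can never be reached. Below is a minimal base R sketch of a balanced version. The column name `A1` is taken from the attempt above; testing for any equal pair rather than fixed values such as 250, and keeping `A` unchanged when nothing matches, are assumptions.

```r
# Example rows from the question, plus one row where all three columns match
data <- data.frame(
  A = rep(0.666667, 4),
  B = c(250, 500, 250, 500),
  C = c(500, 500, 1000, 500),
  D = c(250, 1000, 1000, 500)
)

# Most specific test first; every ifelse() has test, yes and no arguments,
# and every opening parenthesis is closed
data$A1 <- with(data,
  ifelse(B == C & C == D, A / 3,
    ifelse(B == C | C == D | B == D, A / 2,
      A)))  # assumption: leave A unchanged when no two columns match

print(data)
```

This yields a single column, as in the original attempt; it does not yet produce the three separate columns BR, CR and DR described earlier.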

Use `dput(data)`, `data<-data.frame(A=c(...),B=c(...))`, or read.table(text="...") to make a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) — CrunchyTopping, Jul 14 '19 at 15:27
I'm sorry, I'm pretty new to R, I'm not sure I understand what you mean? I've tried to produce a part of the data in a table format, and calculated manually some of the columns I need. — EleMan, Jul 14 '19 at 15:47
Reproducible in large part means being able to copy and paste a minimal example of not only the code that's giving you a problem, but also the function that reads the data that code is using into R, or one of R's built in dataframes. — CrunchyTopping, Jul 14 '19 at 15:53
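
To illustrate the suggestion in these comments: `dput()` prints R code that recreates an object, so its output can be pasted into a question and run by anyone. A small sketch, using the first three rows of the example table as stand-in data (an assumption, since the asker's real file is not shown):

```r
# Stand-in data: first three rows of the example table in the question
data <- read.table(text = "
A        B   C    D
0.666667 250 500  250
0.666667 500 500  1000
0.666667 250 1000 1000
", header = TRUE)

# dput() prints a structure(...) call that can be pasted into a question
dput(data)

# Round trip: evaluating the printed text rebuilds an equivalent data frame
rebuilt <- eval(parse(text = capture.output(dput(data))))
print(all.equal(data, rebuilt))
```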

CrunchyTopping · Answer 1 · 2019-07-15T12:22:23.857

Try this: make a column `s.c` holding the number of times each value occurs within its row, i.e. for each combination of row id (`A`) and number (`value`). Then divide 2/3 by each of those counts and reshape the data back to the original wide format.

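library(reshape) # melt() and cast() come from reshape (reshape2 has no cast())
data<-read.table(text="
A            B      C     D
0.666667    250    500  250
0.666667    500    500  1000
0.666667    250    1000 1000
0.666667    500    500  1000
0.666667    250    500  500
0.666667    250    500  500
",header=TRUE)
# A is constant (2/3), so reuse the column as a row id: a, b, c, ...
data$A<-letters[1:nrow(data)]
# long format: one line per (row id, column, value)
data<-melt(data,id.vars  = "A")
combo_c<-do.call(paste,data[c("value","A")])
# s.c = number of columns in the same row that hold the same value
data$s.c<-as.integer(ave(as.character(data$variable), combo_c, FUN=function(combo_c) length(unique(combo_c))))
# the original A (2/3) divided by that count
data$R<-(2/3)/data$s.c
# melt again, rename the two "variable" columns, and cast back to wide format
data<-melt(data,measure.vars = c("value","s.c","R"))
names(data)[names(data)=="variable"]<-c("v1","v2")
cast(data,A~v2+v1)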

Here's the output:

> cast(data,A~v2+v1)
  A value_B value_C value_D s.c_B s.c_C s.c_D       R_B       R_C       R_D
1 a     250     500     250     2     1     2 0.3333333 0.6666667 0.3333333
2 b     500     500    1000     2     2     1 0.3333333 0.3333333 0.6666667
3 c     250    1000    1000     1     2     2 0.6666667 0.3333333 0.3333333
4 d     500     500    1000     2     2     1 0.3333333 0.3333333 0.6666667
5 e     250     500     500     1     2     2 0.6666667 0.3333333 0.3333333
6 f     250     500     500     1     2     2 0.6666667 0.3333333 0.3333333
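
For comparison, the same counts and ratios can be sketched in base R without reshaping. This is an alternative to the `reshape` approach above, not part of the original answer. The column names `B1`/`C1`/`D1` and `BR`/`CR`/`DR` follow the question, and the extra `500 500 500` row is included to show a three-way match. Unlike the code above, this sketch keeps the numeric column `A` and divides it directly rather than hardcoding 2/3.

```r
data <- read.table(text = "
A        B   C    D
0.666667 250 500  250
0.666667 500 500  1000
0.666667 250 1000 1000
0.666667 500 500  1000
0.666667 250 500  500
0.666667 250 500  500
0.666667 500 500  500
", header = TRUE)

vals <- as.matrix(data[c("B", "C", "D")])

# For each row, count how many of B, C, D equal each entry of that row
counts <- t(apply(vals, 1, function(row) sapply(row, function(v) sum(row == v))))
colnames(counts) <- c("B1", "C1", "D1")

# Divide A by each count; data$A is recycled down each column of the matrix
ratios <- data$A / counts
colnames(ratios) <- c("BR", "CR", "DR")

result <- cbind(data, counts, ratios)
print(result)
```

Because the counts are computed per row from whatever values appear, no separate condition is needed for 250, 500, 1000 and so on.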

edited Jul 15 '19 at 12:22

answered Jul 14 '19 at 15:41

CrunchyTopping


Thank you for your explanation. I'm still not getting the results I need. The ifelse arguments/statements need to come together. What happens now is I get a 0 returned for everything else. If the conditions are met, columns B and D should become 2 (because there are two 250 values) and column C should become 1 (since there is only one 500 value). Now, column A needs to be divided by these new values (2, 1 and 2) to form three new columns. Sorry, hope this makes sense. Thanks. – EleMan Jul 14 '19 at 17:01
The 0 is because of the 0 in ` ifelse(B == 250 & C == 250 & D == 250, A/3, 0))))`. Make the other two look like the above and put what you want it to return instead of the 0, which is just a dummy for you to replace. – CrunchyTopping Jul 14 '19 at 18:23
Is the last obs `0.666667 250 500 500` or `0.666667 500 500 500`? – CrunchyTopping Jul 14 '19 at 19:20
Thank you. It seems like you have managed to get an output which looks right, but when I get to the last command, "cast(data,A~v2+v1)", I get the following error message: Error in cast(data, weight ~ v2 + v1) : could not find function "cast". So I am not sure why that error would come up since library(reshape2) ran. In answer to your last question about the last obs - yes it is 0.666667 500 500 500, I included that in the last table just to show some variability in the results. – EleMan Jul 15 '19 at 03:03
Do `install.packages("reshape")` then `library(reshape)`. – CrunchyTopping Jul 15 '19 at 12:23
Thank you again for your reply. I did install.packages("reshape"), but here are the error messages: The downloaded binary packages are in :\Users\USER\AppData\Local\Temp\RtmpITMTsh\downloaded_packages > library(reshape) > data$A<-letters[1:nrow(data)] Error in 1:nrow(data) : argument of length 0 > data<-melt(data,id.vars = "A") > combo_c<-do.call(paste,data[c("value","A")]) Error in `[.data.frame`(data, c("value", "A")) : undefined columns selected > data$s.c<-as.integer(ave(as.character(data$variable), combo_c, FUN=function(combo_c) length(unique(combo_c)))) – EleMan Jul 15 '19 at 14:08
And the rest of the error; Error in interaction(...) : object 'combo_c' not found > data$R<-(2/3)/data$s.c > data<-melt(data,measure.vars = c("value","s.c","R")) Error: measure variables not found in data: s.c > names(data)[names(data)=="variable"]<-c("v1","v2") > cast(data,A~v2+v1) Error in xj[i] : object of type 'closure' is not subsettable – EleMan Jul 15 '19 at 14:12
`data$A<-letters[1:nrow(data)] Error in 1:nrow(data) : argument of length 0` appears if `data` is not there. Do you have a dataframe called `data`? – CrunchyTopping Jul 15 '19 at 14:36
Dataframe? I am so sorry, I am pretty lost and confused...my first line of code is: data<-read.csv("Data1R.csv") - my data file is called Data1R.csv - does this automatically become a dataframe or do I need to make one? I'm so sorry about these questions. – EleMan Jul 15 '19 at 15:45
Yes, that's probably right if Data1R.csv is in your "working directory". Can you run `dput(data)` and edit your original question with its output? You should start with some basic R tutorials, plus searching online for the answers to the problems you're having in these comments. – CrunchyTopping Jul 15 '19 at 17:49
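
On the data frame question raised in these comments: `read.csv()` returns a data frame, so no extra conversion is needed. The `argument of length 0` error is what `1:nrow(data)` gives when `data` is not a data frame, for instance when no such object has been created in the session and the name falls back to the built-in function `utils::data`. A quick sketch of how one might check, using a temporary file as a stand-in for `Data1R.csv` (the file contents here are an assumption):

```r
# Stand-in for Data1R.csv: write a small CSV to a temporary file
csv_path <- tempfile(fileext = ".csv")
writeLines(c("A,B,C,D",
             "0.666667,250,500,250",
             "0.666667,500,500,1000"), csv_path)

data <- read.csv(csv_path)

print(is.data.frame(data))  # read.csv() returns a data frame
print(nrow(data))           # number of data rows read from the file
str(data)                   # column names and types

# Without a user-created `data`, the name refers to the function utils::data,
# which has no rows, so 1:nrow(...) fails with "argument of length 0"
print(is.null(nrow(utils::data)))
```

If `is.data.frame(data)` is not `TRUE` in the real session, the `read.csv()` line probably did not run successfully, and the rest of the answer's code will fail as shown in the error messages above.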
