This is my first post :) Thank you so much in advance.

I am trying to simulate data in R, and I have already simulated my data set. However, I now need to create new variables that meet a condition, for example:

dataTFULL2$RANDOM100[dataTFULL2$Variable1-dataTFULL2$Variable2 > 0] <- 1
dataTFULL2$RANDOM100[dataTFULL2$Variable1-dataTFULL2$Variable2 < 0] <- 0

With that code I can create the variable that meets the condition for 100% and 0% of the cases. But I need to do the same for 95%, 90%, 85%, 80%, ..., 5% of the cases.

I am stuck with this, but there must be a way to make the condition be met in a specific percentage of the cases.

NelsonGon
JuanJMV
  • Please edit as stated [here](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – NelsonGon Jul 23 '20 at 10:20
  • This is confusing. I don't know what your variables mean and I don't know just what you are trying to do. I could maybe guess, but why should that be required? You are immersed in the particulars of your problem so you know what you are talking about. All we have to go on are two context-free lines of code involving an unknown dataframe and a few vague sentences. When writing a question, it is best to assume that your readers don't already know what you are trying to do. – John Coleman Jul 23 '20 at 11:16

1 Answer

Like the commenters on the original post, I am not quite certain I understand the problem correctly.

As far as I can see, you created a new variable (RANDOM100), which is 1 if Variable1 is higher than Variable2 and 0 otherwise. This is correct in 100% of the cases. Now you want to add errors, so it is only correct for fewer cases (e.g. 90%).

If this is what you are trying to do, the easiest way would be to create a vector of uniformly distributed random numbers and use it to flip the values at a certain share of the indices:

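# One uniform random draw in [0, 1] per row
noise = runif(length(dataTFULL2$RANDOM100), 0, 1)
percentage = 0.90
# Flip the rows whose draw exceeds the threshold (about 10% of them)
flip = noise > percentage
dataTFULL2$RANDOM100[flip] = 1 - dataTFULL2$RANDOM100[flip]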

This code creates a vector of random values between 0 and 1. Wherever the value is above the chosen limit (e.g. 0.90), the value of your variable is flipped (1 becomes 0, 0 becomes 1). Because each row is drawn independently, about 10% of the rows are flipped, not exactly 10%.
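
If an exact share of flipped rows is required, one option is to pick the row indices with `sample()` instead of thresholding random draws. The following is only a minimal sketch under assumptions: the toy data frame, the seed, and the column name `RANDOM90` are made up for illustration and are not from the original data.

```r
set.seed(42)  # assumed seed, only so the sketch is reproducible

# Toy stand-in for dataTFULL2 (hypothetical values)
dataTFULL2 <- data.frame(Variable1 = rnorm(200), Variable2 = rnorm(200))
dataTFULL2$RANDOM100 <- as.integer(dataTFULL2$Variable1 - dataTFULL2$Variable2 > 0)

percentage <- 0.90
n <- nrow(dataTFULL2)
n_flip <- round(n * (1 - percentage))  # number of rows to make "wrong"

# Draw exactly n_flip distinct row indices and flip only those rows
flip_idx <- sample(n, n_flip)
dataTFULL2$RANDOM90 <- dataTFULL2$RANDOM100
dataTFULL2$RANDOM90[flip_idx] <- 1L - dataTFULL2$RANDOM90[flip_idx]

# Share of rows that still agree with the original variable
mean(dataTFULL2$RANDOM90 == dataTFULL2$RANDOM100)
```

With 200 rows and `percentage = 0.90`, exactly 20 rows are flipped, so the agreement is 0.9. When `n * (1 - percentage)` is not a whole number, `round()` picks the nearest count, so the share is as close to the target as the sample size allows.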

Is this what you were trying to do?

Martin Wettstein
  • That is exactly what I needed. Thank you so much! Just one more thing: is there any way to do it without replacing the original variable? – JuanJMV Jul 27 '20 at 13:32
  • Yes, of course. You can give it any name you want. You can copy your variable to a new one (e.g. `dataTFULL2$RANDOM90=dataTFULL2$RANDOM100`) and then do the third line in the answer above for variable `dataTFULL2$RANDOM90`. – Martin Wettstein Jul 27 '20 at 15:33
  • Thank you so much for your help! – JuanJMV Jul 30 '20 at 19:55
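
The answer and the follow-up comment can be combined to build one new column per level (95%, 90%, ..., 5%) while leaving `RANDOM100` untouched. This is a rough sketch, not a definitive implementation: the toy data, the seed, the helper name `flip_share`, and the `RANDOM<pct>` column naming are all assumptions made for illustration.

```r
set.seed(1)  # assumed seed, only so the sketch is reproducible

# Toy stand-in for dataTFULL2 (hypothetical values)
dataTFULL2 <- data.frame(Variable1 = rnorm(1000), Variable2 = rnorm(1000))
dataTFULL2$RANDOM100 <- as.integer(dataTFULL2$Variable1 - dataTFULL2$Variable2 > 0)

# Hypothetical helper: return a copy of the 0/1 vector x in which each
# element is flipped independently with probability 1 - keep
flip_share <- function(x, keep) {
  flip <- runif(length(x)) > keep
  x[flip] <- 1L - x[flip]
  x
}

levels_pct <- seq(95, 5, by = -5)
for (pct in levels_pct) {
  # New column per level; RANDOM100 itself is never modified
  dataTFULL2[[paste0("RANDOM", pct)]] <- flip_share(dataTFULL2$RANDOM100, pct / 100)
}

# Observed agreement with RANDOM100 per column (approximately pct / 100)
sapply(dataTFULL2[paste0("RANDOM", levels_pct)],
       function(col) mean(col == dataTFULL2$RANDOM100))
```

Each column is generated from its own random draws, so the columns are independent of one another, and the observed agreement fluctuates around the target level.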