Separate the given sample into two subsamples: one for which the residuals are below zero and another for which they are above zero. Create a variable Unscrambled.Selection.Sequence that estimates the switching between the two subsamples (1 corresponds to the positive-residual case and 0 to the negative-residual case).

head(Unscrambled.Selection.Sequence,30)
##  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 
##  0  0  1  0  1  1  1  0  0  1  1  1  0  0  0  0  1  0  0  1  0  0  0  0  1  1 
## 27 28 29 30 
##  0  0  0  0

My data is:

dput(head(Unscrambled.Selection.Sequence, 30))
c(`1` = 1, `2` = 1, `3` = 0, `4` = 0, `5` = 1, `6` = 1, `7` = 0, 
`8` = 1, `9` = 0, `10` = 0, `11` = 0, `12` = 0, `13` = 1, `14` = 0, 
`15` = 0, `16` = 0, `17` = 0, `18` = 0, `19` = 0, `20` = 0, `21` = 1, 
`22` = 0, `23` = 0, `24` = 0, `25` = 1, `26` = 1, `27` = 0, `28` = 0, 
`29` = 1, `30` = 1)

But when I do it this way, only "FALSE" shows up instead of 0 or 1. Does anyone know how to set the indicator to 0 for the residuals that are below 0 (and to 1 otherwise)? Thank you!

Phil
Alex
  • Paste your code into the question, not a picture. Use `dput(head(Unscrambled.Selection.Sequence, 30))` and paste the results into your question. You can always convert a logical variable to numeric with `as.numeric()`. – dcarlson Oct 18 '21 at 22:59
  • got it, thank you, but – Alex Oct 19 '21 at 00:16
  • Hi @Alex, did you figure out an answer to your question? – Skaqqs Oct 19 '21 at 00:24
  • I'm simplifying your data frame name. Your two samples are `USS.Low <- USS[USS <= 0]` and `USS.High <- USS[USS > 0]`. This will work directly on the residuals. No need to recode to 0, 1. – dcarlson Oct 19 '21 at 02:45
  • @Skaqqs I think `ifelse(Estimated.Residuals > 0, 1, 0)` works, thank you! – Alex Oct 19 '21 at 15:31
  • @dcarlson yeah, but I need 0 and 1 values in the data frame. – Alex Oct 19 '21 at 15:31

2 Answers

Please do not post code or data as images. It's easier to help you if you include a simple reproducible example with sample input and desired output that can be used to test and verify possible solutions (for example, with `dput()`). See "How to make a great R reproducible example" for ways to improve your question.

I think this will help:

# Example data
Estimated.Residuals <- runif(n=10, min = -1, max = 1)
Estimated.Residuals 
#>  [1]  0.24216058 -0.24466652 -0.05917005  0.50122727 -0.72685828  0.96479633
#>  [7] -0.90828290  0.84974910 -0.75215365 -0.27893627

# Which values are positive?
Estimated.Residuals > 0
#>  [1]  TRUE FALSE FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE FALSE

# Returns values that are TRUE (i.e. positive values)
Estimated.Residuals[Estimated.Residuals > 0]
#> [1] 0.2421606 0.5012273 0.9647963 0.8497491

# Returns mean of values that are TRUE
mean(Estimated.Residuals[Estimated.Residuals > 0])
#> [1] 0.6394833

# Return value based on logic
ifelse(Estimated.Residuals > 0, 1, 0)
#>  [1] 1 0 0 1 0 1 0 1 0 0

Created on 2021-10-18 by the reprex package (v2.0.1)
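
As mentioned in the comments, a logical vector can also be converted to 0/1 directly, without `ifelse()`. The following is a minimal sketch of that idea; the vector `res` and its values are made up for illustration and are not the asker's data.

```r
# Hypothetical residuals, chosen by hand for illustration
res <- c(0.24, -0.24, -0.06, 0.50, -0.73)

# The comparison yields a logical vector; as.integer() maps TRUE -> 1, FALSE -> 0
indicator <- as.integer(res > 0)
indicator
#> [1] 1 0 0 1 0

# Same values as the ifelse() form (which returns doubles, hence the conversion)
identical(indicator, as.integer(ifelse(res > 0, 1, 0)))
#> [1] TRUE
```

Note that `as.integer()` drops the names of a named vector, so if the observation labels matter, `ifelse()` or `+(res > 0)` may be the better choice.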
Skaqqs

Here is an example using data that is included with R:

data(iris)
# Predict Sepal.Length from Sepal.Width, Petal.Length, and Petal.Width
iris.lm <- lm(Sepal.Length~., iris[, -5])
hilow <- ifelse(residuals(iris.lm) < 0, 0, 1)
table(hilow)
# hilow
#  0  1 
# 71 79 
plot(iris$Sepal.Length, residuals(iris.lm), pch= 16, col=hilow+2)
abline(h=0, lty=2)

[Plot: residuals against Sepal.Length, colored by the sign of the residual, with a dashed line at zero]
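
Since the asker wants the 0/1 values stored in a data frame, here is one possible sketch that keeps the indicator next to the residuals and then splits the rows into the two subsamples. The names `res.df`, `positive`, and `subsamples` are hypothetical and introduced only for this example; the model is the same `iris` fit as above.

```r
# Same model as in the answer above, fitted on the built-in iris data
iris.lm <- lm(Sepal.Length ~ ., data = iris[, -5])

# Hypothetical data frame: the residuals plus a 0/1 indicator column
res.df <- data.frame(residual = residuals(iris.lm))
res.df$positive <- as.integer(res.df$residual > 0)

# Split the rows into the two subsamples by the indicator
subsamples <- split(res.df, res.df$positive)
names(subsamples)
#> [1] "0" "1"

# Number of observations in each subsample
sapply(subsamples, nrow)
```

One design detail to be aware of: with `residual > 0`, a residual of exactly zero would be coded 0, whereas `ifelse(residuals(iris.lm) < 0, 0, 1)` codes it 1. With real-valued residuals this rarely matters, but the two conventions are not identical.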

dcarlson