
I need to transform some data so that it has a specific mean and sd. I'm working from the question linked below, except that every value in my result must be positive, i.e. greater than 0.

https://stats.stackexchange.com/questions/46429/transform-data-to-desired-mean-and-standard-deviation

Does anyone have any idea? Thank you.

Sandipan Dey
user3367135
    Please read the info about [how to ask a good question](http://stackoverflow.com/help/how-to-ask) and how to give a [reproducible example](http://stackoverflow.com/questions/5963269). This will make it much easier for others to help you. – Jaap Oct 04 '16 at 06:01

1 Answer

y <- rnorm(1000, mean = 10, sd = 3) # creates normally distributed random data
mean_target <- 5 # desired mean of the data
sd_target   <- 1 # desired sd of the data
y2 <- mean_target + (y - mean(y)) * sd_target/sd(y) # formula from the question you linked
print(y2)

If negative values are a problem, then you could set every value below zero to zero. Of course, this will change the mean and sd slightly.

y2[y2 < 0] <- 0
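To see how much clipping shifts the mean and sd, here is a minimal sketch. The targets (mean 1, sd 1) and the seed are assumptions chosen only so that negative values actually occur; `pmax(y2, 0)` has the same effect as `y2[y2 < 0] <- 0`.

```r
set.seed(1)                          # arbitrary seed, for reproducibility only
y <- rnorm(1000, mean = 10, sd = 3)

mean_target <- 1                     # assumed targets, picked so negatives appear
sd_target   <- 1
y2 <- mean_target + (y - mean(y)) * sd_target / sd(y)

y2_clipped <- pmax(y2, 0)            # clip values below zero up to zero
n_clipped  <- sum(y2 < 0)            # how many values were changed

# before clipping: mean and sd match the targets
c(mean = mean(y2), sd = sd(y2))
# after clipping: mean moves up, sd shrinks
c(mean = mean(y2_clipped), sd = sd(y2_clipped))
```

The more values fall below zero, the further the clipped mean and sd drift from the targets, so this only works well when few values are affected.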

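A linear transformation cannot guarantee positive values for every choice of positive mean and sd: any point lying more than mean_target/sd_target standard deviations below the sample mean is mapped to a value below zero. So the only way I can think of is to manipulate the outliers.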
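Whether a given data set and pair of targets will produce non-positive values can be checked before transforming. The sketch below assumes example targets of mean 1 and sd 0.5; the variable names `z`, `threshold` and `feasible` are made up for illustration.

```r
set.seed(42)                              # arbitrary seed, for reproducibility only
y <- rnorm(100, mean = 0, sd = 1)

mean_target <- 1                          # assumed example targets
sd_target   <- 0.5

z         <- (y - mean(y)) / sd(y)        # standardized values
threshold <- -mean_target / sd_target     # here: -2
y2        <- mean_target + z * sd_target  # same transformation as above

# the transformed data is all positive exactly when no z-score
# falls at or below the threshold
feasible <- min(z) > threshold
feasible
all(y2 > 0)
```

If `feasible` is `FALSE`, the targets cannot be met by rescaling alone, and some values have to be clipped, dropped, or otherwise handled.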

Rereading your question, I think you may want an iterative approach that forces the desired mean and sd. Assuming you are willing to throw away the outliers (values not greater than zero), the following approach may help. But be warned that it may change your data significantly!

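applyMeanSD <- function(y, mean_target, sd_target, max_iter = 10){
    iter <- 0
    repeat {
        iter <- iter + 1
        y <- y[y > 0] # throws away all outliers (values not greater than zero)
        if (length(y) < 2)
            return(NULL) # sd() needs at least two values
        y <- mean_target + (y - mean(y)) * sd_target/sd(y)
        # stop once all values are positive, or after max_iter passes
        # (in the latter case non-positive values may remain)
        if (all(y > 0) || iter >= max_iter)
            break
    }
    return(y)
}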

test <- rnorm(100, 0, 1)
test2 <- applyMeanSD(test, 1, 0.5)
test # negative values included
test2 # no negative values
mean(test2)
sd(test2)
DavideChicco.it
Phann
  • This works on the test data but doesn't work on the data that I'm running (unsure why). I may have to transform that data set to normalize it before I can run this. – user3367135 Oct 04 '16 at 21:40
  • Since I know nothing about your data, I can't help you here. Maybe check for ``any(is.na(mydata))``, ``all(is.finite(mydata))`` or ``all(is.numeric(mydata))`` and so on. – Phann Oct 05 '16 at 06:35
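The checks suggested in the last comment could be run as in the sketch below. The vector `mydata` is a made-up stand-in, since the real data was never posted. Missing values matter here because `mean()` and `sd()` return `NA` as soon as a single `NA` is present, which turns the whole rescaled result into `NA`.

```r
# hypothetical stand-in for the asker's data, with two problem values
mydata <- c(2.5, NA, 4.1, Inf, 3.3)

is.numeric(mydata)        # TRUE: the vector is numeric
any(is.na(mydata))        # TRUE: at least one missing value
all(is.finite(mydata))    # FALSE: NA and Inf are not finite

mean(mydata)              # NA, because of the missing value

# keep only finite values before rescaling
clean <- mydata[is.finite(mydata)]
clean
```

Whether dropping such values is acceptable depends on the data; this is only one possible way to get past the checks.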