I have a left-skewed column that I need to transform, so I tried this:

```r
library(car)
vect <- c(1516201202, 1526238001, 1512050372, 1362933719, 1516342174,
          1526502557, 1523548827, 1512241202, 1526417785, 1517846464)
powerTransform(vect)
```

The values in the vector are 13 digit numeric Unix epoch timestamps. I have a few thousand values and have pasted 10 of them here; I run the same operation on the entire column. This gave me an error:

```
Error in qr.resid(xqr, w * fam(Y, lambda, j = TRUE, ...)) : NA/NaN/Inf in foreign function call (arg 5)
```

I was expecting the transformed column back. Any idea how to do this in R?

Thanks, Raj

Raj
  • Please review what you can and should do in order to provide a [minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). In short, you need to provide minimal & representative sample data such that we are able to reproduce the exact error when copying & pasting data & code. Otherwise, debugging questions of the form "Why am I getting error XYZ" are a guessing game. That aside, I've posted a worked-through example below based on the `iris` sample dataset. Please take a look. – Maurits Evers Dec 10 '19 at 23:44
  • @MauritsEvers I have added a reproducible example – Raj Dec 11 '19 at 00:07

1 Answer

Generally, `car::powerTransform` returns a `powerTransform` object (a list containing, amongst other things, the estimated Box-Cox transformation parameter(s)). It does not return the transformed data. To get the transformed values, you need `bcPower`, which applies the estimated parameter(s), extracted from the `powerTransform` object with `coef()`, to the original data.

Unfortunately, you don't provide sample data, so here's an example based on the `iris` dataset.

```r
library(car)

# Box-Cox transformation of `Sepal.Length`
df <- iris
trans <- powerTransform(df$Sepal.Length)
# Or the same using formula syntax:
# trans <- powerTransform(Sepal.Length ~ 1, data = df)

# Add the transformed `Sepal.Length` data to the original `data.frame`
df <- cbind(
    df,
    Sepal.Length_trans = bcPower(
        with(iris, cbind(Sepal.Length)), coef(trans))[, 1])

# Show a histogram of the Box-Cox-transformed data
library(ggplot2)
ggplot(df, aes(Sepal.Length_trans)) +
    geom_histogram(bins = 30)
```
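For intuition about what `bcPower` computes, here is a minimal base-R sketch of the one-parameter Box-Cox formula. The function name `box_cox_manual` is made up for this illustration and is not part of `car`; it also ignores options such as the Jacobian adjustment, so treat it as a rough picture rather than a replacement.

```r
# Hand-rolled one-parameter Box-Cox transformation (illustration only).
# `box_cox_manual` is a hypothetical helper, not a function from `car`.
box_cox_manual <- function(x, lambda) {
    # Box-Cox is only defined for strictly positive data
    stopifnot(is.numeric(x), all(x > 0))
    if (abs(lambda) < 1e-8) {
        log(x)                      # limiting case as lambda -> 0
    } else {
        (x^lambda - 1) / lambda
    }
}

x <- c(1, 2, 4, 8)
box_cox_manual(x, 1)    # lambda = 1 only shifts the data: x - 1
box_cox_manual(x, 0)    # lambda = 0 is the natural log
box_cox_manual(x, 0.5)  # same as 2 * (sqrt(x) - 1)
```

Because the formula raises the data to the power `lambda`, very large inputs combined with a large estimated `lambda` can overflow.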

Maurits Evers
  • What you have done is correct and that is what I want, but the problem is, as described in the question, that when I use the UNIX 13 digit epoch timestamp as the input I get the error mentioned. Thanks much for the reply! It is an input data issue – Raj Dec 11 '19 at 00:09
  • @Raj Your added sample data don't make sense. All entries are identical. There's nothing to Box-Cox transform here. Please provide **representative** sample data! – Maurits Evers Dec 11 '19 at 00:12
  • OK, I fixed it. Could you please take a look? – Raj Dec 11 '19 at 00:27
  • Please check now, I updated it again to produce the same exact error – Raj Dec 11 '19 at 00:33
  • @Raj I'm very confused about what you're trying to do. From what I understand, `vect` is a time-stamp. Provided it makes sense to transform time-stamp data to make it "look" normal (I'd be sceptical), it might make more sense to transform time-stamps to durations relative to a reference (time-stamp) first. In its current form you're running into numeric issues (try `hist(vect)`). – Maurits Evers Dec 11 '19 at 00:50
  • I think you found a fundamental mistake of mine. Yes, I should not be transforming a timestamp unless I convert it to some other form. Thanks much! – Raj Dec 11 '19 at 00:53
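
The suggestion in the comments, converting the time-stamps to durations relative to a reference before transforming, could look roughly like this in base R. The reference (one day before the earliest time-stamp, so that every duration is strictly positive) and the unit (days) are assumptions made for this sketch. Box-Cox is not shift-invariant, so the result depends on that choice.

```r
vect <- c(1516201202, 1526238001, 1512050372, 1362933719, 1516342174,
          1526502557, 1523548827, 1512241202, 1526417785, 1517846464)

# Raw epoch seconds are so large that a moderately large power overflows
# to Inf; this is one plausible way the numeric issues can arise
vect[1]^40

# Assumed reference: one day before the earliest time-stamp
secs_per_day <- 86400
ref <- min(vect) - secs_per_day

# Durations in days since the reference; every value is >= 1
days <- (vect - ref) / secs_per_day
range(days)

# Inspect the distribution before deciding whether to transform at all
summary(days)
```

`days` could then be tried as the input to `powerTransform()` in place of `vect`. Whether transforming time data this way is meaningful remains a separate question.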