I'm trying to apply the function RE.Johnson from the Johnson package to a whole data frame df that contains 157 observations of 16 variables, and I'd like to loop through all the columns of the data frame instead of doing it manually. I've tried the following code, but it fails with an error (see the edit below).

lapply(df[1:16], function(x) RE.Johnson(x))

I know it might seem easy for you guys, but I'm just starting with R. Thanks

EDIT

R returns the error `Error in RE.ADT(xsl[, i]) : object 'p' not found` and the data are not transformed. Here is a summary of the data:

'data.frame':    157 obs. of  16 variables:
$ X         : num  786988 781045 777589 775266 786843 ...
$ Y         : num  486608 488691 490089 489293 488068 ...
$ Z         : num  182 128 191 80 131 ...
$ pH        : num  7.93 7.69 7.49 7.66 7.92 7.08 7.24 7.19 7.44 7.37 ...
$ CE        : num  0.775 3.284 3.745 4.072 0.95 ...
$ Nitrate   : int  21 14 18 83 30 42 47 101 85 15 ...
$ NP        : num  19.6 43.6 31.7 18.6 31.7 ...
$ Cl        : num  1.9 21.3 2.56 21.5 3.2 ...
$ HCO3      : num  6.65 4.85 4.4 7.72 4.1 ...
$ CO3       : num  0 0 0 0 0.0736 ...
$ Ca        : num  4.12 7.52 3.48 7.58 4.8 10 4.4 4.6 4.2 7.4 ...
$ Mg        : num  3.94 8.92 2.34 7.1 2.5 ...
$ K         : num  0.1442 0.0759 0.0709 0.3691 0.07 ...
$ Na        : num  2.41 34.55 2.51 44.01 2.1 ...
$ SO4       : num  1.45 23.6 1.2 26.66 2 ...
$ Residu_sec: num  0.496 2.102 2.397 2.606 0.608 ...
A. Bohyn
    this is vague: " but it doesn't work." Please explain further. Is there an error? does the output differ from what is expected? Welcome to StackOverflow. Please take the time to read this post on [how to provide a great R example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) as well as how to provide a [minimal, complete, and verifiable example](http://stackoverflow.com/help/mcve) and revise your question accordingly. These tips on [how to ask a good question](http://stackoverflow.com/help/how-to-ask) may also be useful. – lmo Jan 07 '17 at 18:35
  • With a sample of data and the error that it gives you, it would be easier to help you. – PereG Jan 07 '17 at 18:37
  • type `dput(head(df, 20))` in console, then paste the result in here to provide us with sample data – Jake Kaupp Jan 07 '17 at 18:40
  • if the input is a data frame and the operations are on the column try: df[ , 1:16] – Dave2e Jan 07 '17 at 18:41

2 Answers


Not a complete solution, just some information for others.

I tried Johnson::RE.Johnson manually on the columns of the iris data frame. It seems to work fine for Sepal.Length and Petal.Length only:

lapply(iris[c(1,3)], Johnson::RE.Johnson)

... and it returns the error you mentioned for Sepal.Width and Petal.Width.

lapply(iris[c(2,4)], Johnson::RE.Johnson)

Error in RE.ADT(xsl[, i]) : object 'p' not found

This seems odd because all of those columns have a data type of num. The iris data frame doesn't appear to have any missing values or extra character values hidden anywhere, so I'm not sure why the calculation is working for those columns but not others.

Without understanding much about what Johnson::RE.Johnson does to the data, it looks like it is unable to calculate a value for p and so cannot complete the iteration for those columns.

From exploring the source code, the function appears to break down at this point:

  if (xsb.valida[1, i] == 0) 
    xsb.adtest[1, i] <- (Johnson::RE.ADT(xsb[, i])$p) # succeeds
  if (xsl.valida[1, i] == 0) 
    xsl.adtest[1, i] <- (Johnson::RE.ADT(xsl[, i])$p) # fails
  if (xsu.valida[1, i] == 0) 
    xsu.adtest[1, i] <- (Johnson::RE.ADT(xsu[, i])$p) # fails

The function attempts to run Johnson::RE.ADT on xsl, which at this point is a vector of just 0's. RE.ADT then returns the same error about the object p not being found.
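
To see which columns trigger the error without aborting the whole lapply call, each call can be wrapped in tryCatch. This is only a minimal sketch: apply_safely and demo_transform are hypothetical names, and demo_transform merely stands in for Johnson::RE.Johnson (it fails on constant input) so that the example runs without the package.

```r
# Hypothetical helper: apply `f` to every column of `df`, returning the
# error object instead of stopping when a column fails.
apply_safely <- function(df, f) {
  lapply(df, function(col) tryCatch(f(col), error = function(e) e))
}

# Stand-in for Johnson::RE.Johnson (an assumption for this demo only):
# it raises an error when all values are identical.
demo_transform <- function(x) {
  if (length(unique(x)) < 2) stop("object 'p' not found")
  (x - mean(x)) / sd(x)
}

demo_df <- data.frame(a = c(1, 2, 3, 4),
                      b = c(5, 5, 5, 5),
                      c = c(2, 4, 8, 16))

results <- apply_safely(demo_df, demo_transform)

# Flag the columns whose result is an error object
failed <- vapply(results, inherits, logical(1), what = "error")
names(failed)[failed]
```

With the package installed, the same wrapper could presumably be called as apply_safely(df, Johnson::RE.Johnson). It only reports which columns fail; it does not fix the underlying problem.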

Oliver Frost
  • Yes, I've run the same test on my data and some columns are problematic but others are not... I don't know why – A. Bohyn Jan 07 '17 at 18:59
  • Did some exploratory work on the source code for you, see above for the edit. Hopefully it will be enough for someone else to find out what's going on. – Oliver Frost Jan 07 '17 at 19:19

The problem occurs when the function tries to perform the Anderson-Darling test on a vector of identical values. If you do this, you will get the error:

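library(Johnson)
x <- rep(1, times = 100)  # constant vector of 100 ones (rep() has no argument `n`)
RE.ADT(x)                 # fails with the error described above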

So, to solve this problem, you could add a check for constant vectors to the if statements inside the function RE.Johnson:

    # Run the test only when the column is valid AND not constant;
    # otherwise record a p-value of 0.
    if (xsb.valida[1, i] == 0 && any(xsb[, i] != xsb[1, i])) {
        xsb.adtest[1, i] <- RE.ADT(xsb[, i])$p
    } else {
        xsb.adtest[1, i] <- 0
    }
    if (xsl.valida[1, i] == 0 && any(xsl[, i] != xsl[1, i])) {
        xsl.adtest[1, i] <- RE.ADT(xsl[, i])$p
    } else {
        xsl.adtest[1, i] <- 0
    }
    if (xsu.valida[1, i] == 0 && any(xsu[, i] != xsu[1, i])) {
        xsu.adtest[1, i] <- RE.ADT(xsu[, i])$p
    } else {
        xsu.adtest[1, i] <- 0
    }
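
The three near-identical branches above could also be collapsed into one small helper. The following is a rough sketch, not part of the Johnson package: guarded_p and fake_adt are hypothetical names, and the test function is passed in as an argument so the helper can be tried without the package installed. It assumes the vector has no missing values.

```r
# Hypothetical helper: return the p-value of the test `adt`, or 0 when
# every value in `x` is identical and the test cannot be computed.
# In the patch above, `adt` would be RE.ADT.
guarded_p <- function(x, adt) {
  if (any(x != x[1])) adt(x)$p else 0
}

# Stand-in for RE.ADT (an assumption, for illustration only):
# it always reports a p-value of 0.5.
fake_adt <- function(x) list(p = 0.5)

p_varied   <- guarded_p(c(1, 2, 3), fake_adt)  # test is run
p_constant <- guarded_p(rep(0, 10), fake_adt)  # test is skipped, 0 returned
```

Inside the patched RE.Johnson, each branch would then reduce to something like `xsb.adtest[1, i] <- if (xsb.valida[1, i] == 0) guarded_p(xsb[, i], RE.ADT) else 0`.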