I am currently using the Johnson package to transform some data (e.g. percentage of incorrectly recalled items and similar).
(https://cran.r-project.org/web/packages/Johnson/Johnson.pdf)
The NE.Johnson function works fine for most of my data apart from two variables. When trying to run the function I receive the following error:
>recall_Johnson <- RE.Johnson(RECALL)
Error in RE.ADT(xsl[, i]) : object 'p' not found
I am really hoping you guys can give me a hand because I can't work out why I am getting the error!
I have visually searched the data and I can't find anything that really makes the two problem variables stand. Below is a print out of the sorted data for one variable that works fine (percentage of inhibition errors) and another which does not work (percentage of recall errors).
WORKS FINE: Inhibition Errors
> sort.x=sort(Inhibpercent)
> print(sort.x)
[1] 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
[7] 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
[13] 0.8403361 0.8474576 1.2605042 1.6666667 1.6666667 1.6666667
[19] 1.6666667 1.6666667 1.6666667 1.6666667 1.6806723 1.6949153
[25] 1.6949153 1.6949153 1.7021277 1.7241379 1.7241379 1.7543860
[31] 2.0833333 2.1008403 2.1276596 2.5000000 2.5000000 2.5104603
[37] 2.5423729 2.5531915 2.9166667 2.9166667 2.9166667 2.9166667
[43] 2.9411765 2.9535865 3.0042918 3.3333333 3.3333333 3.3333333
[49] 3.3333333 3.3613445 3.3755274 3.3755274 3.3898305 3.3898305
[55] 3.4188034 3.4482759 3.7500000 3.7656904 3.7815126 3.7815126
[61] 3.8297872 4.1666667 4.1841004 4.2016807 4.2194093 4.5833333
[67] 4.6025105 4.6511628 4.6808511 5.0000000 5.0000000 5.0000000
[73] 5.0209205 5.0847458 5.0847458 5.0847458 5.0847458 5.1282051
[79] 5.4393305 5.5319149 5.8333333 6.6666667 6.6666667 6.7796610
[85] 6.7796610 6.7796610 6.7796610 6.8965517 6.8965517 7.3033708
[91] 8.3333333 8.3333333 8.3333333 8.3333333 8.3333333 8.3333333
[97] 8.4388186 8.4745763 8.4745763 8.4745763 8.6206897 9.5833333
[103] 9.6638655 10.0000000 10.1694915 10.1694915 10.3448276 10.5042017
[109] 10.7142857 11.3924051 11.6666667 11.6666667 12.0833333 13.3333333
[115] 15.2542373 18.3333333 20.7547170 25.8536585
DOES NOT WORK: Recall Errors
> sort.x=sort(incorrectrecallpercent)
> print(sort.x)
[1] 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
[7] 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
[13] 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
[19] 0.0000000 0.0000000 0.0000000 0.0000000 0.4166667 0.4166667
[25] 0.4201681 0.4201681 0.4201681 0.4219409 0.4255319 0.4255319
[31] 0.4878049 0.8333333 0.8368201 0.8403361 0.8403361 0.8438819
[37] 1.2500000 1.2500000 1.2500000 1.2605042 1.2875536 1.6666667
[43] 1.6666667 1.6666667 1.6666667 1.6666667 1.6666667 1.6736402
[49] 1.6949153 1.6949153 1.6949153 2.1186441 2.1186441 2.1186441
[55] 2.5000000 2.5000000 2.5104603 2.5210084 2.5210084 2.9166667
[61] 2.9787234 3.3333333 3.3333333 3.3898305 3.3898305 3.7500000
[67] 3.7735849 3.7974684 4.1666667 4.1841004 4.6025105 5.0000000
[73] 5.0000000 5.0847458 5.0847458 5.1724138 5.3571429 5.4621849
[79] 5.5084746 5.5555556 6.6666667 6.6666667 6.7226891 6.7796610
[85] 6.7796610 6.8965517 7.0175439 8.3333333 8.3333333 8.3333333
[91] 8.3333333 8.3333333 8.3333333 8.3333333 8.3333333 10.0000000
[97] 10.0000000 10.0000000 10.1694915 10.3448276 11.2359551 11.6666667
[103] 12.2807018 12.2881356 13.5593220 13.7931035 13.7931035 16.1016949
[109] 16.2790698 16.4556962 16.9491525 17.9487179 26.1603376 29.3103448
[115] 32.2033898 32.2033898 32.2033898 39.9159664
Apart from recall having slightly more instances of zero, I can't really see any substantial differences? But perhaps I am overlooking something?
I have tried following the guidance from this thread: Apply a loop to a data frame
It is suggested to modify this code (lines 101-103 in RE.Johnson.R):
if(xsb.valida[1,i]==0) xsb.adtest[1,i]<-(RE.ADT(xsb[,i])$p)
if(xsl.valida[1,i]==0) xsl.adtest[1,i]<-(RE.ADT(xsl[,i])$p)
if(xsu.valida[1,i]==0) xsu.adtest[1,i]<-(RE.ADT(xsu[,i])$p)
to:
if (xsb.valida[1, i] == 0 & any(xsb[, i]!=xsb[1, i])){
xsb.adtest[1, i] <- (RE.ADT(xsb[, i])$p)
}
else{ xsb.adtest[1, i] <- 0
}
if (xsl.valida[1, i] == 0 & any(xsl[, i]!=xsl[1, i])) {
xsl.adtest[1, i] <- (RE.ADT(xsl[, i])$p)
}
else{ xsl.adtest[1, i] <- 0
}
if (xsu.valida[1, i] == 0 & any(xsu[, i]!=xsu[1, i])) {
xsu.adtest[1, i] <- (RE.ADT(xsu[, i])$p)
}
else{xsu.adtest[1, i] <- 0
}
The reasoning behind the change is the same error can occur when the function tries to perform the transformation to a vector of equal values. I don't think that applies to my data because the values in the vector are not equal, but perhaps I have misunderstood? If the function breaks the vector into sections...perhaps this problem could occur because of the number of zeros in the beginning of the sorted list? I have tried changing the code, but unfortunately, it did solve the problem.
I see that the code below has been inserted to handle NA/NaN occurrences. It appears near the bottom of the script (lines 106-108 in NE.Johnson.R) and just under where I get the error (line 101 in NE.Johnson.R). Which perhaps could be relevant to the error I am getting?
# insertion by Boda Martin to handle with NA/NaN occurrences
xsb.adtest[which(is.na(xsb.adtest))] <- 0
xsl.adtest[which(is.na(xsl.adtest))] <- 0
xsu.adtest[which(is.na(xsu.adtest))] <- 0
Any help, guidance or gentle nudging in the correct direction would be much appreciated.
Here is the link to download the package if you want to see the code: https://cran.r-project.org/web/packages/Johnson/index.html
Many thanks, Rachel