
In R, I compare groups with the dunn.test. Here is some example data, where "type" is the grouping variable:

my_table <- data.frame ("type" = c (rep ("low", 5), rep ("mid", 5), rep ("high", 5)),
                        "var_A" = rnorm (15),
                        "var_B" = c (rnorm (5), rnorm (5, 4, 0.1), rnorm (5, 12, 2)) 
                        )

I want to compare the variables var_A and var_B among the three groups with dunn.test (), which gives the following results:

library (dunn.test)
dunn.test (my_table$var_A, my_table$type)
>  Kruskal-Wallis rank sum test
>
> data: x and group
> Kruskal-Wallis chi-squared = 6.08, df = 2, p-value = 0.05
>
>
> Comparison of x by group                            
> (No adjustment)                                
> Col Mean-|
> Row Mean |       high        low
> ---------+----------------------
>      low |   0.919238
>          |     0.1790
>          |
>      mid |   0.989949   0.070710
>          |     0.1611     0.4718
>
> alpha = 0.05
> Reject Ho if p <= alpha/2

and

dunn.test (my_table$var_B, my_table$type)
> Kruskal-Wallis rank sum test
>
> data: x and group
> Kruskal-Wallis chi-squared = 12.5, df = 2, p-value = 0
>
>
> Comparison of x by group                            
> (No adjustment)                                
> Col Mean-|
> Row Mean |       high        low
> ---------+----------------------
>      low |   3.535533
>          |    0.0002*
>          |
>      mid |   1.767766  -1.767766
>          |     0.0385     0.0385
>
> alpha = 0.05
> Reject Ho if p <= alpha/2

I understand that for var_A, I cannot see any significant differences between the three groups. For var_B, the groups "low" and "high" differ significantly. When presenting the results, I could choose a table like

library (tidyverse)
data.frame ("low" = my_table %>%
                filter (type == "low") %>%
                select (c ("var_A", "var_B")) %>%
                sapply (mean) %>%
                round (digits = 2),
            "mid" = my_table %>%
                filter (type == "mid") %>%
                select (c ("var_A", "var_B")) %>%
                sapply (mean) %>%
                round (digits = 2),
            "high" = my_table %>%
                filter (type == "high") %>%
                select (c ("var_A", "var_B")) %>%
                sapply (mean) %>%
                round (digits = 2 )
                )


>             low    mid   high
> var_A      0.14  -0.10   0.74
> var_B     -0.41   3.97  11.44
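For what it's worth, the same table of group means can probably be produced more compactly in base R. The following is only a sketch: `vars` and `means_table` are illustrative names, and `set.seed()` is added here just to make the random data reproducible (it is not in the code above).

```r
set.seed(1)  # assumption: only here so the example is reproducible

my_table <- data.frame(
  type  = c(rep("low", 5), rep("mid", 5), rep("high", 5)),
  var_A = rnorm(15),
  var_B = c(rnorm(5), rnorm(5, 4, 0.1), rnorm(5, 12, 2))
)

vars <- c("var_A", "var_B")  # illustrative name

# split the two variables by group, then take the column means of each piece;
# sapply() binds the results into a variables-by-groups matrix
means_table <- sapply(split(my_table[vars], my_table$type), colMeans)

# split() orders the groups alphabetically, so put them in the desired order
means_table <- round(means_table[, c("low", "mid", "high")], 2)
means_table
```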

What I'd like to achieve is to add characters in order to indicate the results of the dunn.test. This could look something like

>               low         mid         high 
> var_A     0.14  a    -0.10  a      0.74  a
> var_B    -0.41  a     3.97 ab     11.44  b
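Once the letters are known, attaching them to the means is mostly a `paste` job. Here is a minimal sketch in which both the means and the letters are typed in by hand from the two tables above; `means`, `lets` and `out` are illustrative names, and this does not yet derive anything from the test results.

```r
groups <- c("low", "mid", "high")
vars   <- c("var_A", "var_B")

# means copied from the table above
means <- matrix(c( 0.14, -0.10,  0.74,
                  -0.41,  3.97, 11.44),
                nrow = 2, byrow = TRUE, dimnames = list(vars, groups))

# letters copied from the desired output above (entered manually)
lets <- matrix(c("a", "a",  "a",
                 "a", "ab", "b"),
               nrow = 2, byrow = TRUE, dimnames = list(vars, groups))

# paste() drops the matrix shape, so rebuild it; both matrices are
# traversed column by column, which keeps means and letters aligned
out <- matrix(paste(formatC(means, format = "f", digits = 2), lets),
              nrow = nrow(means), dimnames = dimnames(means))

noquote(out)  # print without the surrounding double quotes
```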

So, my long but short question is: how can I tell the dunn.test function to output the grouping characters (e.g. "a", "ab" or "b")? Or is there a workaround to get the desired characters?

yenats
  • Can you explain in natural language what that dplyr code is supposed to be doing and why you are choosing the particular lower case labels that you have added? It looks to me that there will need to be a `paste` operation and possibly an explicit print or cat to suppress the resulting double quotes, but I don't see where the `dunn.test` results are being used in your desired output. You do need to put in what you consider to be the correct answer and not just a random table in the shape you want. – IRTFM Aug 22 '19 at 19:52
  • Of course: a) in the table - creation process, I subset my_table by the variable types ("low", "mid" and "high") and then calculate the mean for each of the resulting groups and each of the independent variables (var_A and var_B); the result is then rounded to two digits. However, the creation of the table works fine and is not the core of my question. – yenats Aug 22 '19 at 20:25
  • b) in the `dunn.test` results, I can see that with respect to `var_A`, the three groups `low`, `mid` and `high` don't differ significantly. So they all get the same label. With respect to `var_B`, the groups differ: `low` is different from `high`. So I assigned: `low`: "a", `high`: "b" (because they differ). `mid` gets "ab", because there is no statistical difference between `mid` and the other two groups. The question is not about how to print a table with paste etc., but about how to tell the `dunn.test` function to output the lower case labels that I have added. – yenats Aug 22 '19 at 20:31
  • I doubt that is an option in that implementation of the Dunn test. Seems fairly clear looking at the code that the author is not a particularly accomplished user of R. A proper R implementation would have a `print` and perhaps a `summary` method for an object of class "dunntest". Instead, the code in that package only returns a list with 5 elements and the info you seek is not among the items returned. The screen output is entirely constructed internally and sent to the screen with `cat` statements. If you wanted the screen output look at `?capture.output` – IRTFM Aug 22 '19 at 23:38
  • Ok, thank you. In summary, this means that the `dunn.test` package is developed with minor R skills and that I cannot get the desired classification from it? – yenats Aug 23 '19 at 06:47
  • Yes to the first. And to the second, it means you have to come up with your own processing strategy since that package has limited options. – IRTFM Aug 23 '19 at 18:18
  • I know this is an old thread but it seems that you can use the `FSA::dunnTest` function to display the results in a matrix-like object that you can read with the `rcompanion::cldList` function in order to display the significance letters. Check this URL: https://rcompanion.org/handbook/F_08.html – Corto Aug 02 '21 at 16:05
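The two ideas from the comments (processing the list that `dunn.test` returns yourself, or going through `FSA::dunnTest` and `rcompanion::cldList`) might look roughly like the sketch below. It is untested against the exact data in the question and rests on a few assumptions: the `multcompView` package is my own addition for turning p-values into letters, `type` is stored as a factor, `set.seed()` is only there for reproducibility, and the group names must not contain "-" because that character separates the two groups in a comparison label. The first approach uses a threshold of alpha/2 because `dunn.test` reports one-sided p-values by default ("Reject Ho if p <= alpha/2" in the output above).

```r
library(dunn.test)
library(multcompView)  # assumption: not mentioned in the thread
library(FSA)
library(rcompanion)

set.seed(1)  # assumption: only for reproducibility
my_table <- data.frame(
  type  = factor(rep(c("low", "mid", "high"), each = 5),
                 levels = c("low", "mid", "high")),
  var_A = rnorm(15),
  var_B = c(rnorm(5), rnorm(5, 4, 0.1), rnorm(5, 12, 2))
)

## Approach 1: work with the list returned by dunn.test()
dt <- dunn.test(my_table$var_B, my_table$type, table = FALSE, kw = FALSE)

# name each p-value by its comparison, e.g. "low - mid" becomes "low-mid",
# which is the "group1-group2" naming that multcompLetters() expects
p_vals <- setNames(dt$P.adjusted, gsub(" ", "", dt$comparisons))

# one-sided p-values, hence alpha/2 as the threshold
letters_B <- multcompLetters(p_vals, threshold = 0.05 / 2)$Letters
letters_B

## Approach 2: FSA::dunnTest() followed by rcompanion::cldList()
dn <- dunnTest(var_B ~ type, data = my_table, method = "bonferroni")

# dn$res holds one row per comparison, with the adjusted p-value in P.adj
cld_B <- cldList(P.adj ~ Comparison, data = dn$res, threshold = 0.05)
cld_B
```

Either result can then be pasted next to the group means. Which letter a particular group receives depends on the package, so only the pattern of shared letters should be read as meaningful.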

1 Answer


Maybe the kruskal() function in the agricolae package gets you what you're looking for. Its output includes 'groups', which contains the letters assigned to each group. The package documentation says that the post-hoc comparison is done using Fisher's LSD though, not the Dunn test. You can, however, pass the p.adj argument to adjust for multiple comparisons.

library(tidyverse)
library(agricolae)
library(reshape2)

my_table <- data.frame ("type" = c (rep ("low", 5), rep ("mid", 5), rep ("high", 5)),
                        "var_A" = rnorm (15),
                        "var_B" = c (rnorm (5), rnorm (5, 4, 0.1), rnorm (5, 12, 2)) 
)

# melt in order to use lapply 
my_MeltedTable = melt(my_table, id.vars='type')

# apply kruskal(value, type) to each of the two variables (var_A and var_B)
results = lapply(split(my_MeltedTable[,c("type", "value")], my_MeltedTable$variable), 
       function(x) kruskal(x$value, x$type, p.adj="bonferroni"))

# the grouping information you'd like will be found in
results$var_A$groups
results$var_B$groups

There is probably a way to pull out the things you need from within the lapply(), but I don't know how, so here is how I got the required table:

# create empty df for results
resTable <- data.frame(matrix(ncol = 6, nrow = 2))

# results$means contains means of variable per group
# assign col names from row names in results
colnames(resTable) = row.names(results$var_A$means)

# pull out means for var_A & round to 2 digits & transpose as are rows
resTable[1,1:3] = round(digits = 2, t(results$var_A$means[,1])) 
# pull out means for var_B & round to 2 digits & transpose 
resTable[2,1:3] = round(digits = 2, t(results$var_B$means[,1])) 

# results$groups contains the letters denoting the statistical grouping;
# its rows are sorted by mean rank, so index them by group name to match the means
grp = row.names(results$var_A$means)
resTable[1,4:6] = as.character(results$var_A$groups[grp,2]) # stat grouping for var_A
resTable[2,4:6] = as.character(results$var_B$groups[grp,2]) # stat grouping for var_B

resTable = resTable[,c(2,5,3,6,1,4)] # re-order cols
rownames(resTable) = c("var_A", "var_B") # name rows
colnames(resTable) = c("low", " ","mid", " ", "high","") # name cols

And after all that long-windedness!

        low    mid    high  
var_A  0.12 a 0.40 a -0.76 a
var_B -0.45 c 3.99 b 11.46 a
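
The manual steps could probably be condensed by building each cell inside a small helper. This is only a sketch, assuming a recent agricolae version in which both `means` and `groups` carry the group names as row names; `cell` and `lev` are illustrative names, and `set.seed()` is added just for reproducibility.

```r
library(agricolae)

set.seed(1)  # assumption: only for reproducibility
my_table <- data.frame(
  type  = c(rep("low", 5), rep("mid", 5), rep("high", 5)),
  var_A = rnorm(15),
  var_B = c(rnorm(5), rnorm(5, 4, 0.1), rnorm(5, 12, 2))
)

# one kruskal() result per variable, returned as a named list
results <- lapply(my_table[c("var_A", "var_B")], function(v)
  kruskal(v, my_table$type, p.adj = "bonferroni"))

lev <- c("low", "mid", "high")  # desired column order

# hypothetical helper: "<rounded mean> <letter>" for each group,
# looking up both pieces by group name so the two stay aligned
cell <- function(res, lev) {
  paste(formatC(res$means[lev, 1], format = "f", digits = 2),
        as.character(res$groups[lev, 2]))
}

resTable <- t(sapply(results, cell, lev = lev))
colnames(resTable) <- lev
noquote(resTable)
```

Looking the letters up by name matters because `groups` is sorted by mean rank while `means` is sorted by group name.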
Cam