3

I am re-writing this posting based on my progress after having advices from @PhilipLeifeld (see the comment section below).

I have tried to put clmm outputs to latex, using texreg. Since the package does not support clmm in its default mode, I tried to extend the package with extract function (see the answer part on Print "beautiful" tables for h2o models in R). Meanwhile, I found that code posted on https://gist.github.com/kjgarza/340201f6564ca941fe25 can be used as a starting point for me; I shall refer the code as the baseline code below. The following model (result) is pretty much representative of my actual codes.

library(ordinal)
library(texreg)
d<-data.frame(wine)
result<-clmm(rating~ 1+temp+contact+(1+temp|judge), data=d)

What I would like to display in a latex table are random effects components, which are omitted in the baseline code. The following is part of summary output.

summary(result)

Random effects:
 Groups Name        Variance Std.Dev. Corr  
 judge  (Intercept) 1.15608  1.0752         
        tempwarm    0.02801  0.1674   0.649 
 Number of groups:  judge 9 

Specifically, I want to display variance (and number of groups); I do not need correlation parts. While working on the baseline code, I also learned that "texreg" allows for only a limited set of arguments for a latex display and that the option of "include.variance" is relevant to my goal. Thus, I tried to add random effects components to a "gof" argument as including the option of "include.variance" in the baseline code.

Here is what I have done. First, I add "include.variance" to the part of defining extract.clmm function.

extract.clmm <- function(model, include.thresholds = TRUE, include.aic = TRUE, 
                     include.bic = TRUE, include.loglik = TRUE, include.variance = TRUE, oddsratios = TRUE, conf.level= 0.95, include.nobs = TRUE, ...) {
s <- summary(model, ...)

tab <- s$coefficients
thresh <- tab[rownames(tab) %in% names(s$alpha), ]
threshold.names <- rownames(thresh)
threshold.coef <- thresh[, 1]
threshold.se <- thresh[, 2]
threshold.pval <- thresh[, 4]
beta <- tab[rownames(tab) %in% names(s$beta), ]
beta.names <- rownames(beta)
beta.coef <- beta[, 1]
beta.se <- beta[, 2]
beta.pval <- beta[, 4]

Then, I added the following three lines.

### for random effect components###   
rand<-s$ST[[1]]
rand.names<-rownames(rand)
rand.var<-rand[,1]

The following part is what I additionally included to the baseline code ("include.variance").

if (include.variance == TRUE) {       
    gof.names <- c(gof.names, rand.names)
    gof <- c(gof, rand)
    gof.decimal <- c(gof.decimal, TRUE)   
}

After running the extract.clmm function, I ran the following.

test<-extract.clmm(result, include.variance=TRUE, oddsratios=FALSE)

Then, I got an error message: Error in validityMethod(object) : gof.names and gof must have the same length! While I found that the lengths of "rand" and "rand.names" in the case of "result" is 4 and 2, I do not know how to handle this. Any comments would be very appreciated. Thanks in advance.

Community
  • 1
  • 1
Han
  • 55
  • 9
  • Hi there. In the ``texreg`` case, you would basically want to edit an existing ``extract`` method to accommodate random effects and number of groups. Just yesterday, I happened to post a full-fledged tutorial on how to create or modify ``extract`` methods to get the results you want. You can find it in my answer here: http://stackoverflow.com/questions/38894044/print-beautiful-tables-for-h2o-models-in-r. Can you try that and report back if you are stuck somewhere? Try to recycle existing argument names like ``include.groups`` and ``include.variance`` if possibe (see package code). – Philip Leifeld Aug 26 '16 at 09:02
  • Thank!!! I will try and get back to you. Once again I appreciate the valuable information. Indeed, you are the author of texreg! – Han Aug 26 '16 at 09:08
  • When I worked on the paragraph that starts with "Let's start with the coefficient block", I found that 'result1@model' (<- my model case, corresponding to 'model.outupt.1@model in your explanation) yields an error message: "trying to get slot 'model' from an object (class "clmm") that is not an S4 object." So, I couldn't go further. – Han Aug 26 '16 at 10:15
  • Yeah, that's because ``clmm`` objects are not structured in the same way as ``h2o`` objects. The paragraph also states that you need to closely examine the data structure of the estimated object using the ``str`` command or something similar. That is, look closely at ``str(result1)`` and ``str(summary(result1))`` to find the information you need, such as the coefficients. Hint: The extract method on GitHub you referenced above already shows that the coefficient table is stored under ``summary(result1)$coefficients``. That method already does most of the work for you and needs to be adjusted. – Philip Leifeld Aug 26 '16 at 15:37
  • Thank you for getting back to help me. Thank to your advice, I have solved confidential interval issues. Meanwhile, as working on incorporating random effect components, I confront an issue. First of all, I found that `summary(result1)$ST` returns necessary information about random effects. So, using that information, I add to the GitHub code `random<-s$ST` `random.names<-rownames(random)` `random.var<-random[,1]`. – Han Aug 27 '16 at 12:35
  • While the extract.clmm function does not yield an error message, I got an error message while applying the function to `result1`. At this moment, my best guess is that the necessary information is attached not to `result1$ST` but to `result1$ST$judge`. If you could give additional hints, I will retry to solve this problem. Thanks in advance. – Han Aug 27 '16 at 12:36
  • It would be helpful if you could update your question with your recent progress. That would allow people to run your code and see why it doesn't work. That said, ``s$ST`` doesn't look like the out under "Random effects" in the summary. Are you sure these are the random effect variances? Looking at the clmm source code (https://github.com/cran/ordinal/blob/master/R/clmm.methods.R#L168), it looks like they are using a package-internal, non-exported function called ``formatVC`` to compute random effect variances ad hoc. That would be bad news because you can't access it from the outside. – Philip Leifeld Aug 28 '16 at 17:16
  • The ``formatVC`` function is just for the formatting of the random effect block, as the name implies. What you need is the (also internal) ``varcov`` function. So to get the variances, use ``diag(ordinal:::varcov(result1)[[1]])``; to get standard deviations, take the ``sqrt`` of that. This kind of sucks because it means I won't be able to include your ``extract.clmm`` function in the ``texreg`` package later as the use of ``:::`` to access package-internal functions is discouraged on CRAN. Unless you can convince the ``clmm`` package authors to export their ``varcov`` function, of course. – Philip Leifeld Aug 28 '16 at 17:38
  • Also note that the ``[[1]]`` index only retrieves the first set of random variances (i.e., for the ``judge`` conditioning variable). So you may need to iterate through the list and retrieve the remaining components as well should they exist, i.e., should people use more conditioning variables. At least that's how I understand the data structure. – Philip Leifeld Aug 28 '16 at 17:41
  • Note: in my earlier comment, the link with the correct line pointer should have been: https://github.com/cran/ordinal/blob/master/R/clmm.methods.R#L174 – Philip Leifeld Aug 28 '16 at 17:42
  • Ha, sorry for my ignorance, but it looks like you *can* actually generate the variances from the ``s$ST`` component by computing ``diag(s$ST[[1]] %*% t(s$ST[[1]]))`` for each conditioning variable (substitute the index ``[[1]]`` as mentioned before). So you should write a for-loop or some ``sapply`` or ``lapply`` command to get this for the different conditioning variables that may be present. – Philip Leifeld Aug 28 '16 at 17:57
  • @PhilipLeifeld Thank you so much for the detailed comments. I will work on and shortly update my progress as you advised. – Han Aug 29 '16 at 08:30
  • No problem! I saw that you edited your answer. Here are a couple more hints: 1) The ``extract.clmm`` function in your question seems to be incomplete. The closing bracket is missing, as is the part where you create and return a ``texreg`` object. 2) For the random effects component, try my solution above. I.e., work with ``diag(s$ST[[i]] %*% t(s$ST[[i]]))`` for all ``i`` present in the length of the list ``s$ST``. 3) For each element of ``gof`` and ``gof.names``, you will need one corresponding element in ``gof.decimal``; here, you are only adding one ``TRUE`` value instead of at least two. – Philip Leifeld Aug 29 '16 at 23:16

2 Answers2

5

Let's first rewrite your test case such that it contains both a model with random effects (clmm) and a model without random effects (clm), both from the ordinal package. This will allow us to check if the extract.clmm function we are about to write yields results that are formatted in a compatible way with the existing extract.clm function in the texreg package:

library("ordinal")
library("texreg")
d <- data.frame(wine)
result.clmm <- clmm(rating ~ 1 + temp + contact + (1 + temp|judge), data = d)
result.clm <- clm(rating ~ 1 + temp + contact, data = d)

The existing clm method for the generic extract function in texreg looks like this, and we will be able to use it as a template for writing a clmm method as both object types are structured in similar ways:

# extension for clm objects (ordinal package)
extract.clm <- function(model, include.thresholds = TRUE, include.aic = TRUE, 
    include.bic = TRUE, include.loglik = TRUE, include.nobs = TRUE, ...) {
  s <- summary(model, ...)

  tab <- s$coefficients
  thresh <- tab[rownames(tab) %in% names(s$aliased$alpha), , drop = FALSE]
  threshold.names <- rownames(thresh)
  threshold.coef <- thresh[, 1]
  threshold.se <- thresh[, 2]
  threshold.pval <- thresh[, 4]
  beta <- tab[rownames(tab) %in% names(s$aliased$beta), , drop = FALSE]
  beta.names <- rownames(beta)
  beta.coef <- beta[, 1]
  beta.se <- beta[, 2]
  beta.pval <- beta[, 4]
  if (include.thresholds == TRUE) {
    names <- c(beta.names, threshold.names)
    coef <- c(beta.coef, threshold.coef)
    se <- c(beta.se, threshold.se)
    pval <- c(beta.pval, threshold.pval)
  } else {
    names <- beta.names
    coef <- beta.coef
    se <- beta.se
    pval <- beta.pval
  }

  n <- nobs(model)
  lik <- logLik(model)[1]
  aic <- AIC(model)
  bic <- BIC(model)
  gof <- numeric()
  gof.names <- character()
  gof.decimal <- logical()
  if (include.aic == TRUE) {
    gof <- c(gof, aic)
    gof.names <- c(gof.names, "AIC")
    gof.decimal <- c(gof.decimal, TRUE)
  }
  if (include.bic == TRUE) {
    gof <- c(gof, bic)
    gof.names <- c(gof.names, "BIC")
    gof.decimal <- c(gof.decimal, TRUE)
  }
  if (include.loglik == TRUE) {
    gof <- c(gof, lik)
    gof.names <- c(gof.names, "Log Likelihood")
    gof.decimal <- c(gof.decimal, TRUE)
  }
  if (include.nobs == TRUE) {
    gof <- c(gof, n)
    gof.names <- c(gof.names, "Num.\ obs.")
    gof.decimal <- c(gof.decimal, FALSE)
  }

  tr <- createTexreg(
      coef.names = names, 
      coef = coef, 
      se = se, 
      pvalues = pval, 
      gof.names = gof.names, 
      gof = gof, 
      gof.decimal = gof.decimal
  )
  return(tr)
}

setMethod("extract", signature = className("clm", "ordinal"), 
    definition = extract.clm)

The first difference for clmm objects is that the coefficients etc. are not stored under summary(model)$aliased$alpha and summary(model)$aliased$beta, but directly under summary(model)$alpha and summary(model)$beta.

The second thing we need to do is add goodness-of-fit elements for the number of groups and the random variances.

The number of groups is apparently stored under summary(model)$dims$nlev.gf, with multiple entries for the different conditioning variables. So that's easy.

The random variances are not stored anywhere, so we need to look this up in the source code of the ordinal package. We can see there that the print.summary.clmm function uses an internal helper function called formatVC to print the variances. This function is contained in the same R script and basically just does the formatting and calls another internal helper function called varcov (also contained in the same file) to compute the variances. That function, in turn, computes the transposed crossproduct of model$ST to get the variances. We can simply do the same thing directly in the GOF block of our extract.clmm function, e.g., using diag(s$ST[[1]] %*% t(s$ST[[1]])) for the first random effect. We just have to make sure we do that for all random effects, which means we need to put this in a loop and replace [[1]] by an iterator like [[i]].

The final clmm method for the extract function could look like this:

# extension for clmm objects (ordinal package)
extract.clmm <- function(model, include.thresholds = TRUE,
    include.loglik = TRUE, include.aic = TRUE,  include.bic = TRUE,
    include.nobs = TRUE, include.groups = TRUE, include.variance = TRUE, ...) {
  s <- summary(model, ...)

  tab <- s$coefficients
  thresh <- tab[rownames(tab) %in% names(s$alpha), ]
  threshold.names <- rownames(thresh)
  threshold.coef <- thresh[, 1]
  threshold.se <- thresh[, 2]
  threshold.pval <- thresh[, 4]
  beta <- tab[rownames(tab) %in% names(s$beta), ]
  beta.names <- rownames(beta)
  beta.coef <- beta[, 1]
  beta.se <- beta[, 2]
  beta.pval <- beta[, 4]

  if (include.thresholds == TRUE) {
    cfnames <- c(beta.names, threshold.names)
    coef <- c(beta.coef, threshold.coef)
    se <- c(beta.se, threshold.se)
    pval <- c(beta.pval, threshold.pval)
  } else {
    cfnames <- beta.names
    coef <- beta.coef
    se <- beta.se
    pval <- beta.pval
  }

  gof <- numeric()
  gof.names <- character()
  gof.decimal <- logical()
  if (include.loglik == TRUE) {
    lik <- logLik(model)[1]
    gof <- c(gof, lik)
    gof.names <- c(gof.names, "Log Likelihood")
    gof.decimal <- c(gof.decimal, TRUE)
  }
  if (include.aic == TRUE) {
    aic <- AIC(model)
    gof <- c(gof, aic)
    gof.names <- c(gof.names, "AIC")
    gof.decimal <- c(gof.decimal, TRUE)
  }
  if (include.bic == TRUE) {
    bic <- BIC(model)
    gof <- c(gof, bic)
    gof.names <- c(gof.names, "BIC")
    gof.decimal <- c(gof.decimal, TRUE)
  }
  if (include.nobs == TRUE) {
    n <- nobs(model)
    gof <- c(gof, n)
    gof.names <- c(gof.names, "Num.\ obs.")
    gof.decimal <- c(gof.decimal, FALSE)
  }
  if (include.groups == TRUE) {
    grp <- s$dims$nlev.gf
    grp.names <- paste0("Groups (", names(grp), ")")
    gof <- c(gof, grp)
    gof.names <- c(gof.names, grp.names)
    gof.decimal <- c(gof.decimal, rep(FALSE, length(grp)))
  }
  if (include.variance == TRUE) {
    var.names <- character()
    var.values <- numeric()
    for (i in 1:length(s$ST)) {
      variances <- diag(s$ST[[i]] %*% t(s$ST[[i]]))
      var.names <- c(var.names, paste0("Variance: ", names(s$ST)[[i]], ": ", 
          names(variances)))
      var.values <- c(var.values, variances)
    }
    gof <- c(gof, var.values)
    gof.names <- c(gof.names, var.names)
    gof.decimal <- c(gof.decimal, rep(TRUE, length(var.values)))
  }

  tr <- createTexreg(
      coef.names = cfnames, 
      coef = coef, 
      se = se, 
      pvalues = pval, 
      gof.names = gof.names, 
      gof = gof, 
      gof.decimal = gof.decimal
  )
  return(tr)
}

setMethod("extract", signature = className("clmm", "ordinal"), 
    definition = extract.clmm)

You can just execute the code at runtime and texreg should be able to create tables from clmm objects, including random variances. I will add this code to the next texreg release.

You can apply this to your example as follows:

screenreg(list(result.clmm, result.clm), single.row = TRUE)

The result is compatible across clmm and clm objects, as you can see here in the output:

==================================================================
                              Model 1            Model 2          
------------------------------------------------------------------
tempwarm                        3.07 (0.61) ***    2.50 (0.53) ***
contactyes                      1.83 (0.52) ***    1.53 (0.48) ** 
1|2                            -1.60 (0.69) *     -1.34 (0.52) ** 
2|3                             1.50 (0.60) *      1.25 (0.44) ** 
3|4                             4.22 (0.82) ***    3.47 (0.60) ***
4|5                             6.11 (1.02) ***    5.01 (0.73) ***
------------------------------------------------------------------
Log Likelihood                -81.55             -86.49           
AIC                           181.09             184.98           
BIC                           201.58             198.64           
Num. obs.                      72                 72              
Groups (judge)                  9                                 
Variance: judge: (Intercept)    1.16                              
Variance: judge: tempwarm       0.03                              
==================================================================
*** p < 0.001, ** p < 0.01, * p < 0.05

You can use arguments include.variances == FALSE and include.groups == FALSE to switch off the reporting of the variances and group sizes if you want.

Philip Leifeld
  • 2,253
  • 15
  • 24
2

As a quick remark on @Philip's answer, in the new version or R studio the following does not return a matrix:

thresh <- tab[rownames(tab) %in% names(s$alpha), ]

This causes the following code to return an error. However, a quick fix for this can be:

thresh <- subset.matrix(tab, rownames(tab) %in% names(s$alpha) )
beta <- subset.matrix(tab, rownames(tab) %in% names(s$beta) )
Tagc
  • 8,736
  • 7
  • 61
  • 114
  • Thanks. I have fixed this as follows both in my reply and the package on [GitHub](http://github.com/leifeld/texreg): ``thresh <- tab[rownames(tab) %in% names(s$alpha), , drop = FALSE]`` and ``beta <- tab[rownames(tab) %in% names(s$beta), , drop = FALSE]``. I will push it to CRAN at some point. – Philip Leifeld Jan 18 '17 at 10:33