I have a vector with rownames, so it can be considered a "matrix" with 2 columns (one for filename, one for Topic):

> res
                   Topic
jardine-1.docx.md      1
jardine-2.docx.md      1
jardine-a1.docx.md     1
jardine-a2.docx.md     1
jardine-a3.docx.md     1
jardine-a4.docx.md     3
jardine-a5.docx.md     1
jardine-a6.docx.md     3
jardine-a7.docx.md     3
jardine-a8.docx.md     1
...

These are results from the awesome R package on topic modelling, aptly called topicmodels.

I want to cast this "vector" into wide format, just for presentation purposes.

This of course breaks "tidy data" principles, where "each observation, or case, is in its own row" (see Data Transformation with dplyr). Nevertheless, the wide format is much neater than the long format:

              Topic1       Topic2             Topic3
1  jardine-1.docx.md jk-1.docx.md jardine-a4.docx.md
2  jardine-2.docx.md jk-2.docx.md jardine-a6.docx.md
3 jardine-a1.docx.md jk-4.docx.md jardine-a7.docx.md
4 jardine-a2.docx.md jk-5.docx.md  singtel-1.docx.md
5 jardine-a3.docx.md jk-6.docx.md  singtel-2.docx.md
6 jardine-a5.docx.md         <NA>  singtel-3.docx.md
7 jardine-a8.docx.md         <NA>  singtel-4.docx.md
8       jk-3.docx.md         <NA>  singtel-5.docx.md
9       jk-7.docx.md         <NA>               <NA>

This of course can be done in a variety of ways - one of which looks like this (warning: ugly)

# via cbind
T1=rownames(subset(res, Topic==1))
T2=rownames(subset(res, Topic==2))
T3=rownames(subset(res, Topic==3))
n=max(length(T1),length(T2),length(T3))
length(T1) <- n
length(T2) <- n
length(T3) <- n
cbind(T1,T2,T3)

My question:

Are there better ways of presenting this, considering that all the code will live in an R Markdown file for presentation purposes?

hongsy
  • Convert to data.frame and then use "standard" reshaping tools? – talat Jun 15 '17 at 06:43
  • Possible duplicate of http://stackoverflow.com/questions/2185252 – zx8754 Jun 15 '17 at 06:52
  • Generally not a good idea to put unequal-length vectors into a data.frame. See [this](https://stackoverflow.com/questions/7196450/create-a-data-frame-of-unequal-lengths) for some options if you really must do it. – Adam Quek Jun 15 '17 at 06:53
  • By _better ways of presenting this?_, you mean what? How to present the data? Or how to present the reshaping of the data? – psychOle Jun 15 '17 at 07:44
  • `res` is neither a vector nor a two-column matrix. Use `str` to check what you actually have there. – Roland Jun 15 '17 at 11:22
  • @herbaman I mean better ways of *presenting the data*. The reshaping is just a last minute resort of showing all items compactly. – hongsy Jun 19 '17 at 08:38
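
The `str` check suggested in the comments can be sketched as follows. The `res` built here is a hypothetical reconstruction of the ten rows printed in the question, assuming it is a data frame with a single `Topic` column, which is what `subset(res, Topic == 1)` working in the question suggests.

```r
# Hypothetical reconstruction of the rows of `res` printed in the question.
# Assumption: `res` is a data.frame whose file names are stored as row names.
res <- data.frame(
  Topic = c(1, 1, 1, 1, 1, 3, 1, 3, 3, 1),
  row.names = c(
    "jardine-1.docx.md", "jardine-2.docx.md", "jardine-a1.docx.md",
    "jardine-a2.docx.md", "jardine-a3.docx.md", "jardine-a4.docx.md",
    "jardine-a5.docx.md", "jardine-a6.docx.md", "jardine-a7.docx.md",
    "jardine-a8.docx.md"
  )
)

str(res)                # prints the structure: one column, Topic
is.data.frame(res)      # TRUE for this reconstruction
is.matrix(res)          # FALSE
is.vector(res)          # FALSE
head(rownames(res), 3)  # the file names are row names, not a second column
```

If the real `res` reports something different under `str`, the reshaping code needs to be adapted accordingly.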

2 Answers

I would create an interactive table in R Markdown with the DT package; see its vignette.

library(DT)

datatable(
  dataframe, class = 'cell-border stripe', extensions = c('Buttons', 'FixedColumns'), options = list(
    dom = 'Bfrtip', scrollX = TRUE, fixedColumns = TRUE,
    buttons = c('copy', 'csv', 'excel', 'pdf', 'print')
  )
)

Explore the vignette. It covers a bunch of options, such as formatting fields with colors and shapes, letting the user add or remove columns interactively, scrolling through wide tables, etc.
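
Since `dataframe` above is a placeholder, here is a minimal self-contained sketch of the same idea. The `wide` data frame is a hypothetical, shortened stand-in for the wide table from the question, and the call is guarded so the snippet still runs where DT is not installed.

```r
# Hypothetical stand-in for the wide table; NA pads the shorter columns.
wide <- data.frame(
  Topic1 = c("jardine-1.docx.md", "jardine-2.docx.md", "jardine-a1.docx.md"),
  Topic2 = c("jk-1.docx.md", "jk-2.docx.md", NA),
  Topic3 = c("jardine-a4.docx.md", NA, NA),
  stringsAsFactors = FALSE
)

# Build the widget only if the DT package is available.
if (requireNamespace("DT", quietly = TRUE)) {
  DT::datatable(
    wide,
    rownames = FALSE,               # hide the 1..n row index
    class = "cell-border stripe",   # same table styling as above
    options = list(pageLength = 10, scrollX = TRUE)
  )
}
```

Placed in an R Markdown chunk with HTML output, the table is rendered where the chunk sits.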

Prometheus
  • thank you for this solution! this is (i believe) what I am looking for. May I check if this interactivity is preserved in an `.Rmd` R Markdown file exported to HTML? – hongsy Jun 19 '17 at 08:39
  • Sure. It's gonna work in R Markdown, flexdashboard, a Shiny app, etc. You can see examples here: http://rstudio.github.io/DT/. – Prometheus Jun 19 '17 at 08:50

If you're just looking for some cleaner code, maybe this will satisfy you?

library(magrittr) # for %>%
nmax <- max(table(res$Topic))
ntopics <- 3 # or ntopics <- max(res$Topic) to be more general
build_col <- function(i) {
  rn <- rownames(subset(res, Topic == i))
  c(rn, rep(NA, nmax - length(rn))) # you may replace NA by "" here for it to look nicer
}
sapply(1:ntopics, build_col) %>% as.data.frame %>% setNames(paste0("Topic", 1:ntopics))

#               Topic1 Topic2             Topic3
# 1  jardine-1.docx.md   <NA> jardine-a4.docx.md
# 2  jardine-2.docx.md   <NA> jardine-a6.docx.md
# 3 jardine-a1.docx.md   <NA> jardine-a7.docx.md
# 4 jardine-a2.docx.md   <NA>               <NA>
# 5 jardine-a3.docx.md   <NA>               <NA>
# 6 jardine-a5.docx.md   <NA>               <NA>
# 7 jardine-a8.docx.md   <NA>               <NA>
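
For comparison, here is a base-R sketch of the same reshaping built on `split()`, with no pipe needed. It is only one possible variant: `res` is a hypothetical reconstruction of the ten rows shown in the question (assumed to be a data frame), and `groups` and `wide` are illustrative names.

```r
# Hypothetical reconstruction of the rows of `res` printed in the question.
res <- data.frame(
  Topic = c(1, 1, 1, 1, 1, 3, 1, 3, 3, 1),
  row.names = c(
    "jardine-1.docx.md", "jardine-2.docx.md", "jardine-a1.docx.md",
    "jardine-a2.docx.md", "jardine-a3.docx.md", "jardine-a4.docx.md",
    "jardine-a5.docx.md", "jardine-a6.docx.md", "jardine-a7.docx.md",
    "jardine-a8.docx.md"
  )
)

# Group file names by topic; explicit factor levels keep topics with no files.
ntopics <- 3
groups <- split(rownames(res), factor(res$Topic, levels = seq_len(ntopics)))
names(groups) <- paste0("Topic", names(groups))

# Indexing past the end of a vector yields NA, which pads the short columns.
nmax <- max(lengths(groups))
wide <- as.data.frame(
  lapply(groups, function(x) x[seq_len(nmax)]),
  stringsAsFactors = FALSE
)
wide
```

In an R Markdown chunk, passing `wide` to `knitr::kable()` should give a static table if interactivity is not needed.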
moodymudskipper