Visualization of Likert responses using net stacked bar charts: how to compare between groups?

Question

I draw on this excellent entry on how to visualize Likert responses using R:

[ https://stats.stackexchange.com/questions/25109/visualizing-likert-responses-using-r-or-spss ]

Although the answers to that question are extremely helpful, I have not managed to compare groups within one plot. If that is not possible, I would appreciate help combining multiple plots into one overall graph.

Many thanks!

#Packages needed#

install.packages(c('ggplot2', 'HH', 'reshape', 'gridExtra'), repos='https://cran.r-project.org')

library(ggplot2)
library(HH)
library(reshape)
library(gridExtra)

#Code to produce sample data similar to the one I used, i.e. a number of statement items (col1) measured by a 5-point Likert scale (col2), with a grouping variable (col3) and the responses in frequencies (col4; I set mean and s.d. so that numbers are positive).#

mydata1<-expand.grid(col1=c('item1', 'item2', 'item3', 'item4'), col2=c('0', '1', '2', '3', '4'), col3=c('T1'))
m<-2:7
s<-0:1
mydata1$col4=sapply(rnorm(20,m,s), function(x) {round(x,2)})
mydata1$col2<-factor(mydata1$col2, levels=c(0,1,2,3,4), labels=c("strongly disagree", "disagree", "neutral", "agree", "strongly agree"), ordered=TRUE)
mydata1<-reshape(mydata1, direction="wide", v.names="col4", timevar="col2", idvar="col1")

mydata2<-expand.grid(col1=c('item1', 'item2', 'item3', 'item4'),col2=c('0', '1', '2', '3', '4'),col3=c('T0'))
m<-2:7
s<-0:1
mydata2$col4=sapply(rnorm(20,m,s), function(x) {round(x,2)})
mydata2$col2<-factor(mydata2$col2, levels=c(0,1,2,3,4),labels=c("strongly disagree", "disagree", "neutral", "agree", "strongly agree"), ordered=TRUE)
mydata2<-reshape(mydata2, direction="wide", v.names="col4", timevar="col2", idvar="col1")

mydata3<-expand.grid(col1=c('item1', 'item2', 'item3', 'item4'),col2=c('0', '1', '2', '3', '4'),col3=c('C1'))
m<-2:7
s<-0:1
mydata3$col4=sapply(rnorm(20,m,s), function(x) {round(x,2)})
mydata3$col2<-factor(mydata3$col2,levels=c(0,1,2,3,4),labels=c("strongly disagree", "disagree", "neutral", "agree", "strongly agree"), ordered=TRUE)
mydata3<-reshape(mydata3, direction="wide", v.names="col4", timevar="col2", idvar="col1")

mydata4<-expand.grid(col1=c('item1', 'item2', 'item3', 'item4'),col2=c('0', '1', '2', '3', '4'),col3=c('C0'))
m<-2:7
s<-0:1
mydata4$col4=sapply(rnorm(20,m,s), function(x) {round(x,2)})
mydata4$col2<-factor(mydata4$col2,levels=c(0,1,2,3,4), labels=c("strongly disagree", "disagree", "neutral", "agree", "strongly agree"), ordered=TRUE)
mydata4<-reshape(mydata4, direction="wide", v.names="col4", timevar="col2", idvar="col1")

mydata<-rbind(mydata1, mydata2, mydata3, mydata4)
summary(mydata)

#Preparation of the data#

mydata$col4.neutral<-NULL
colnames(mydata)[colnames(mydata)=="col4.strongly disagree"]<-"Strongly disagree"
colnames(mydata)[colnames(mydata)=="col4.disagree"]<-"Disagree"
colnames(mydata)[colnames(mydata)=="col4.agree"]<-"Agree"
colnames(mydata)[colnames(mydata)=="col4.strongly agree"]<-"Strongly agree"

#PLOT#

items<-mydata[, c("Strongly disagree", "Disagree", "Agree", "Strongly agree")]
itemsg=likert(items, grouping =mydata$col3) 
plot(itemsg)

PROBLEM: The code produces a single plot but does not compare between groups. It seems to plot each row in the order in which it appears in mydata, so if we re-order the rows, we might get a plot that allows easy comparison between items and groups.

> ro.mydata
            col1 col3 Strongly disagree Disagree Agree Strongly agree
item1 (T1) item1   T1              2.00     6.00  2.00           6.00
item1 (T0) item1   T0              2.00     6.00  2.00           6.00
item2 (T1) item2   T1              1.90     6.59  2.67           8.33
item2 (T0) item2   T0              3.57     6.76  3.23           9.03
item3 (T1) item3   T1              4.00     2.00  4.00           2.00
item3 (T0) item3   T0              4.00     2.00  4.00           2.00
item4 (T1) item4   T1              7.02     2.66  6.31           2.76
item4 (T0) item4   T0              3.56     3.63  4.74           3.21
item1 (C1) item1   C1              2.00     6.00  2.00           6.00
item1 (C0) item1   C0              2.00     6.00  2.00           6.00
item2 (C1) item2   C1              4.01     6.87  2.62           6.23
item2 (C0) item2   C0              2.95     5.95  3.69           5.36
item3 (C1) item3   C1              4.00     2.00  4.00           2.00
item3 (C0) item3   C0              4.00     2.00  4.00           2.00
item4 (C1) item4   C1              4.10     2.54  6.12           2.62
item4 (C0) item4   C0              4.57     1.94  3.64           2.86
>
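
The row order and row names shown above do not have to be typed by hand. A minimal sketch, assuming the group codes always consist of a block letter (T or C) followed by 1 or 0; the toy data frame and the names block, suffix and ro.toy are invented here for illustration:

```r
# Toy stand-in for mydata (values invented); col1 = item, col3 = group.
toy <- expand.grid(col1 = c("item1", "item2"),
                   col3 = c("T1", "T0", "C1", "C0"),
                   stringsAsFactors = FALSE)
toy$Agree <- seq_len(nrow(toy))

# Sort keys: T block before C block, then item, then suffix 1 before 0.
block  <- factor(substr(toy$col3, 1, 1), levels = c("T", "C"))
suffix <- factor(substr(toy$col3, 2, 2), levels = c("1", "0"))

ro.toy <- toy[order(block, toy$col1, suffix), ]

# Build labels such as "item1 (T1)" from the columns themselves.
rownames(ro.toy) <- paste0(ro.toy$col1, " (", ro.toy$col3, ")")
rownames(ro.toy)
```

The same three-key order() call should carry over to the real data with any number of items, since nothing in it depends on the item count.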


I don't think you can do grouping in the way you're describing using the HH package - it looks like the example at CrossValidated was done using irutils. I'll see if I can work something up. — TARehman, Sep 05 '12 at 13:30
This is great! Thanks a lot. In the meantime I added the following lines to the data preparation part: `sort.mydata<-mydata[order(mydata$col1),]; T.mydata<-sort.mydata[sort.mydata$col3 == "T1" | sort.mydata$col3 == "T0",]; C.mydata<-sort.mydata[sort.mydata$col3 == "C1" | sort.mydata$col3 == "C0",]; ro.mydata<-rbind(T.mydata, C.mydata); row.names(ro.mydata)<-c("item1 (T1)", "item1 (T0)", "item2 (T1)","item2 (T0)", "item3 (T1)","item3 (T0)", "item4 (T1)","item4 (T0)","item1 (C1)", "item1 (C0)", "item2 (C1)", "item2 (C0)","item3 (C1)","item3 (C0)", "item4 (C1)","item4 (C0)")` — TiF, Sep 05 '12 at 13:49
... and modified the final part of the code as follows: `items<-ro.mydata[, c("Strongly disagree", "Disagree", "Agree", "Strongly agree")]; itemsg=likert(items); plot(itemsg)`. This produces a sorted plot with the item names as labels. If we managed to add some space between items (e.g. between item2 T1/T0 and item3 T1/T0) and between the two treatments (i.e. between the block of T1/T0s and the block of C1/C0s), this would bring us very close to the imagined plot! Many many thanks for any ideas! — TiF, Sep 05 '12 at 13:52
Answered below, but I think the easiest solution is going to be processing each group using by(), and then plotting them all onto one plot using a specific layout. HTH. — TARehman, Sep 05 '12 at 14:21

score 4 · Accepted Answer · edited May 23 '17 at 10:30

So, this is something that is a bit outside of my expertise as the underlying function uses the lattice plotting feature in R, and I haven't really used it. With that said, I believe there is a way to accomplish what you would like by using by() to process each call of likert(), and then plotting them in a single plot using layout arguments.

Start with:

items_byg <- by(data=mydata[3:6],
                INDICES=mydata[2],
                FUN=likert,
                main="",xlab="",auto.key=list(columns=1,space="right"))

This does by-group processing of your data frame - the data are the four variables mydata[3:6], the index is mydata[2], and the function is likert() (from the HH package). Then, pass additional arguments to likert() - namely, making the main plot title blank, the x-labels blank, and changing the automatically generated key to the right side. I'm not totally familiar with the arguments to auto.key, but ?barchart will provide some information.
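
To see what by() hands back before any plotting is involved, here is a minimal sketch on a toy data frame. The data and the use of colSums as FUN are made up for illustration; only the by() call pattern is taken from the code above.

```r
# Toy data, invented for illustration: a group label and two response columns.
toy <- data.frame(
  group    = c("T0", "T0", "T1", "T1"),
  disagree = c(2, 4, 1, 3),
  agree    = c(6, 2, 7, 5)
)

# by() splits the rows by group and applies FUN to each sub-data-frame.
sums_byg <- by(data = toy[2:3], INDICES = toy["group"], FUN = colSums)

dim(sums_byg)     # number of groups
sums_byg[[2]]     # result for the second group (T1): column sums of its rows
```

With FUN=likert, each element would be a plot object instead of a vector of sums, which is why the elements can later be picked out with [[ ]] and printed one by one.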

print(items_byg[[1]],position=c(0,0.75,1,1),more=TRUE)
print(items_byg[[2]],position=c(0,0.5,1,0.75),more=TRUE)
print(items_byg[[3]],position=c(0,0.25,1,0.5),more=TRUE)
print(items_byg[[4]],position=c(0,0,1,0.25))

The output of the by() will be a list with each element being a trellis object for that group of variables. Accordingly, we just print each of them to a single plot. As you can see, the first three have more=TRUE, which tells R to expect additional plots. Each also has an argument of position=c(x1,y1,x2,y2). Basically, each position argument gives the coordinates of the lower left corner and the upper right corner of each plot. A bit more info is available from this SO answer.

The outcome of this is the attached plot. It's not perfect by far, but I think it's a start. Note that you could change the by() call to split by question, rather than by group, if you wanted to visually compare each question between your groups.

You still need to fix a few things, obviously, like making everything line up nicely and eliminating the duplicated keys. There are some challenges there, but in principle I believe this accomplishes what you want: nicely grouped and stacked bar charts.

[Figure: Example of Stacked Plots Using Lattice]

EDITED TO ADD

After reviewing what you were saying, I've made some tweaks that I think will work for you. Start with ordering your data by the items:

# order on the item column itself rather than on a one-column data frame
mydata <- mydata[order(mydata$col1),]

Next, we'll still use by() to get the groups, but with a few changes. First, we use the items as the indices, rather than the groups (since you want to visually compare the groups). So, each plot generated will be around a single item. We set ylab to be "Groups", and we use scales to label the y-axis with the group names.

items_byg <- by(data=mydata[3:6],
                INDICES=mydata[1],
                FUN=likert,
                main="",xlab="",ylab="Groups",auto.key=list(columns=1,space="right"),
                scales=list(y=list(labels=mydata[[2]])))

Now we can use a loop to print the plots. We loop from 1 through the second-to-last plot, because the last plot has to be printed without more=TRUE. The position is the tricky part, but even that's not too bad. x1 is always 0 and x2 is always 1 (the x-coordinates of the lower left and upper right corners). We get y1 by taking 1 (the top of the plotting area) and subtracting x*(1/dim(items_byg)). So, if you have 5 items, the first one has its lower edge at 1-(1)(1/5), which is 0.8. The second is at 1-(2)(1/5), which is 0.6, and so on. y2 is one panel height, 1/dim(items_byg), above y1 (so, if y1 is 0.8, y2 is 1.0, and so on). We also add an Item title for each item based on x, and then feed everything to print.

for(x in 1:(dim(items_byg)-1)) {

    x1 <- 0
    y1 <- 1-(x*(1/dim(items_byg)))
    x2 <- 1
    y2 <- y1+(1/dim(items_byg))

    items_byg[[x]]$main <- paste("Item",x,sep=" ")
    print(items_byg[[x]],position=c(x1,y1,x2,y2),more=TRUE)
}

The last little bit is finishing the final plot: creating its item title and doing the final print(), which needs to be separate so that it is not called with more=TRUE.

items_byg[[dim(items_byg)]]$main <- paste("Item",dim(items_byg),sep=" ")
print(items_byg[[dim(items_byg)]],position=c(0,0,1,0+(1/dim(items_byg))))

Running this, I get the image below, and it should generalize to multiple items with little difficulty.

[Figure: the resulting plot, one panel per item]
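
For a larger number of items, the position arithmetic can be pulled out into a small function. This is only a sketch: panel_positions is a hypothetical helper written for this answer, not part of lattice or HH, and it assumes all panels get the same height.

```r
# Hypothetical helper: returns an n x 4 matrix whose rows are
# c(x1, y1, x2, y2) for n equally tall panels stacked from top to bottom.
panel_positions <- function(n) {
  stopifnot(n >= 1)
  height <- 1 / n
  y1 <- 1 - seq_len(n) * height      # lower edge of each panel
  cbind(x1 = 0, y1 = y1, x2 = 1, y2 = y1 + height)
}

pos <- panel_positions(5)
pos[1, ]   # first panel: lower edge 0.8, upper edge 1.0

# Intended use with the items_byg list from above (not run here):
# n <- length(items_byg)
# pos <- panel_positions(n)
# for (i in seq_len(n)) {
#   print(items_byg[[i]], position = unname(pos[i, ]), more = i < n)
# }
```

Passing more = i < n makes the last print() call drop more=TRUE by itself, so the final plot would not need a separate statement.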

Thanks! This is indeed a very nice idea. I really appreciate your effort! It seems to work very elegantly for a small number of items. Unfortunately, I have more than 20 items in my real data set, and no idea how to put the single plots together into one plot. I tried for more than 30 minutes and failed :- / Do you happen to know a way to add some space between the groups of items in the solution I suggested above? That would be fantastic!! — TiF, Sep 05 '12 at 14:53
If you have 20 items in your dataset and 4 groups, it's not hard to generalize the code above. Adding the space between the groups of items that you have in your solution will be more difficult, because it involves changing the "scale" of your y-axis at specific points. Do you have 4 groups total? — TARehman, Sep 05 '12 at 15:13
Yes, I do have four groups in total (2 treatments, i.e. two treated and two control groups). Do you have an idea how to link the individual plots together with some space between each pair? : ) — TiF, Sep 05 '12 at 15:20
Just edited the answer again. If that doesn't work, not sure what else I can do to help. :) — TARehman, Sep 05 '12 at 15:50
Great! Many many thanks for your time and effort. This was really helpful. — TiF, Sep 05 '12 at 21:33
