I am trying to use the melt() function from the "reshape2" package in R to stack a data frame while keeping the categorical labels for the individual observations. My question is: how do I adapt Eric Cai's code to produce multiple side-by-side notched boxplots, split at the level of behaviours$Family (a two-level factor column) and grouped by each behavioural variable, for the data set called behaviours (a link to the dummy data is supplied below)?
My aim is to colour-code these notched boxplots by family (V4 = red and W3 = blue) and add a legend. However, I am running into a dimension issue when arranging the data frame with melt(), and I am having trouble deciphering it. If anyone can help, many thanks in advance.
The reproducible dummy data can be found at the bottom of a Stack Overflow page: Reproducible data
Here is an example:
I am trying to follow Eric Cai's instructions:
(1) Stack the data:
(a) Retain the categorical column (a two-level factor) for family [,1]
(b) Retain all behavioural variables [,2:13]
# Set vectors for labelling the data
behaviours.label = c("Swimming",
                     "Not.Swimming",
                     "Running",
                     "Not.Running",
                     "Fighting",
                     "Not.Fighting",
                     "Resting",
                     "Not.Resting",
                     "Hunting",
                     "Not.Hunting",
                     "Grooming",
                     "Not.Grooming")
family.labels = c("V4", "G8",
                  "V4", "G8",
                  "V4", "G8",
                  "V4", "G8",
                  "V4", "G8",
                  "V4", "G8",
                  "V4", "G8",
                  "V4", "G8",
                  "V4", "G8",
                  "V4", "G8",
                  "V4", "G8",
                  "V4", "G8")
library(tidyr)
data_long <- gather(behaviours, x, Mean.Value, Swimming:Not.Grooming)
head(data_long)
# load reshape2, which provides melt()
library(reshape2)
# stack the data while retaining the Family and behavioural variables
stacked.data = melt(behaviours, id = c('Family', 'behaviours'))
# remove the column that gives the column name variable
stacked.data = stacked.data[, -3]
#head(stacked.data)
colnames(stacked.data)<-c("Family", "Behaviours", "Values")
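For comparison, here is a minimal sketch of the stacking step on a small made-up data frame (toy and its three behaviour columns are hypothetical stand-ins for behaviours, not the real data). It assumes Family is the only identifier column, so that is the only name passed to id.vars, and it uses melt()'s variable.name and value.name arguments rather than dropping and renaming columns afterwards.

```r
library(reshape2)

# Hypothetical stand-in for the behaviours data:
# one two-level factor column plus a few numeric behaviour columns
set.seed(1)
toy <- data.frame(
  Family       = factor(rep(c("V4", "G8"), each = 10), levels = c("V4", "G8")),
  Swimming     = rnorm(20),
  Not.Swimming = rnorm(20),
  Running      = rnorm(20)
)

# Family identifies each row; every other column is stacked into
# a name column (Behaviours) and a value column (Values)
stacked.toy <- melt(toy,
                    id.vars       = "Family",
                    variable.name = "Behaviours",
                    value.name    = "Values")

str(stacked.toy)
```

With a single identifier column, melt() returns exactly three columns (the identifier, the variable name, and the value), so removing the third column with [, -3] would discard the values themselves.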
Generating the Box Plots
Generate an object called boxplots.double, which uses the formula Mean.value ~ Family + Behaviours to separate the plots into 12 groups of doublets (i.e. each behaviour is grouped at the level of behaviours$Family in a single plot). In Eric Cai's code, "at =" is an option that specifies the locations of the box plots along the horizontal axis, and xaxt = 'n' suppresses the default horizontal axis so that a custom axis can be added with axis() and title().
boxplots.double = boxplot(values~Family + Behaviours,
                          data = stacked.data,
                          at = c(1:24),
                          xaxt='n',
                          ylim = c(min(0, min(-3)),
                                   max(7, na.rm = T)),
                          notch=TRUE,
                          col = c("red", "blue"),
                          names = c("V4", "G8"),
                          cex.axis=1.0,
                          srt=45)
axis(side=1, at=c(1.8, 6.8), labels=c("Swimming",
                                      "Not.Swimming",
                                      "Running",
                                      "Not.Running",
                                      "Fighting",
                                      "Not.Fighting",
                                      "Resting",
                                      "Not.Resting",
                                      "Hunting",
                                      "Not.Hunting",
                                      "Grooming",
                                      "Not.Grooming"), line=0.5, lwd=0)
Error message
Error in axis(side = 1, at = 1:24, labels = c("V4", "G8"), xaxt = "n", :
'at' and 'labels' lengths differ, 24 != 2
In addition: Warning message:
In bxp(list(stats = c(-1.20186549488911, -0.970033304559564, -0.465271399251147, :
some notches went outside hinges ('box'): maybe set notch=FALSE
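One possible reading of the error is that names = c("V4", "G8") supplies 2 labels for 24 box positions, and the separate axis() call likewise pairs 2 positions with 12 labels. Below is a minimal sketch on made-up data (toy.long, with only three hypothetical behaviours rather than the real twelve) in which every at/labels pair has the same length. It drops the names argument, places one tick per behaviour midway between its two boxes, and adds the legend with legend().

```r
# Hypothetical long-format data: 2 families x 3 behaviours x 40 observations
set.seed(42)
behaviour.names <- c("Swimming", "Running", "Fighting")
toy.long <- data.frame(
  Family     = factor(rep(rep(c("V4", "G8"), each = 40), times = 3),
                      levels = c("V4", "G8")),   # V4 first, so it gets "red"
  Behaviours = factor(rep(behaviour.names, each = 80),
                      levels = behaviour.names),
  Values     = rnorm(240)
)

family.cols <- c("red", "blue")
n.groups <- nlevels(toy.long$Family) * nlevels(toy.long$Behaviours)

# Family varies fastest, so boxes come in V4/G8 pairs per behaviour;
# col is recycled across the pairs
bp <- boxplot(Values ~ Family + Behaviours,
              data  = toy.long,
              at    = seq_len(n.groups),
              xaxt  = "n",
              notch = TRUE,
              col   = family.cols)

# One tick per behaviour, centred between its two boxes (1.5, 3.5, 5.5)
tick.at <- seq(1.5, by = 2, length.out = nlevels(toy.long$Behaviours))
axis(side = 1, at = tick.at, labels = behaviour.names, line = 0.5, lwd = 0)

legend("topright", legend = levels(toy.long$Family), fill = family.cols)
```

Setting the factor levels explicitly matters here: by default the levels would be sorted alphabetically, G8 would come first, and the red/blue assignment would be swapped.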