
Sorry for the multiple questions about R. I'm new and still learning! I am currently trying to make a multiple-line line graph with my data. I have 3 treatment groups with 4 individuals each, and I am planning on factoring those into 3 groups for R. First, I want to make sure my data is set up in Excel in such a way that I could make this graph. Second, how would I go about doing this? Is ggplot the best tool, or is there another package that could be used?

I would like to have my X-axis as the dates (formatted like 10.15.2015, for example), my Y-axis as the weights, and my 3 treatment groups, Lean, AdLib, and HF, as the data lines. As I said above, I used `datum$Group = factor(datum$Group)` to group the Pig individuals into their 3 treatment groups.

I have looked at other questions on here but it did not seem like they were what I wanted.

Here is my data:

dput(datum)
structure(list(X10.5.15 = c(56L, 54L, 61L, 39L, 52L, 66L, 48L, 
49L, 59L, 55L, 37L, 59L), X10.26.15 = c(76L, 70L, 72L, 61L, 79L, 
93L, 72L, 72L, 84L, 71L, 50L, 85L), X11.3.15 = c(82L, 76L, 88L, 
67L, 90L, 102L, 83L, 83L, 100L, 96L, 56L, 100L), X11.10.15 = c(87L, 
84L, 93L, 71L, 99L, 110L, 93L, 93L, 109L, 107L, 65L, 112L), X11.17.15 = c(93L, 
90L, 100L, 77L, 106L, 116L, 101L, 100L, 121L, 122L, 71L, 119L
), X11.24.15 = c(102L, 99L, 109L, 86L, 113L, 124L, 107L, 108L, 
128L, 128L, 80L, 122L), X12.3.15 = c(114L, 113L, 123L, 100L, 
118L, 132L, 122L, 118L, 143L, 142L, 91L, 137L), X12.10.15 = c(117L, 
117L, 125L, 106L, 134L, 141L, 130L, 126L, 152L, 151L, 98L, 148L
), X12.17.15 = c(125L, 122L, 134L, 112L, 150L, 154L, 135L, 134L, 
162L, 162L, 106L, 160L), X12.22.15 = c(128L, 127L, 135L, 114L, 
156L, 161L, 141L, 140L, 166L, 176L, 109L, 166L), X12.29.15 = c(135L, 
130L, 142L, 119L, 155L, 164L, 149L, 149L, 174L, 195L, 121L, 176L
), X1.5.16 = c(138L, 135L, 150L, 129L, 167L, 172L, 163L, 154L, 
185L, 205L, 128L, 182L), X1.12.16 = c(154L, 157L, 166L, 146L, 
180L, 188L, 173L, 163L, 200L, 208L, 140L, 188L), X1.19.16 = c(148L, 
151L, 165L, 141L, 180L, 182L, 171L, 176L, 211L, 219L, 149L, 197L
), X1.26.16 = c(154L, 151L, 171L, 148L, 192L, 196L, 181L, 179L, 
212L, 230L, 156L, 205L), X2.2.16 = c(162L, 162L, 179L, 154L, 
200L, 200L, 191L, 184L, 228L, 228L, 162L, 225L), X2.9.16 = c(172L, 
169L, 187L, 164L, 203L, 202L, 188L, 194L, 237L, 253L, 168L, 234L
), X2.16.16 = c(173L, 167L, 192L, 162L, 211L, 215L, 199L, 202L, 
233L, 258L, 173L, 238L), X2.23.16 = c(185L, 174L, 202L, 172L, 
220L, 218L, 208L, 204L, 253L, 254L, 185L, 239L), X2.29.16 = c(183L, 
169L, 202L, 166L, 216L, 220L, 204L, 206L, 256L, 269L, 187L, 252L
), Pig = c(102L, 105L, 108L, 204L, 101L, 104L, 106L, 602L, 103L, 
107L, 205L, 603L), Group = structure(c(3L, 3L, 3L, 3L, 1L, 1L, 
1L, 1L, 2L, 2L, 2L, 2L), .Label = c("AdLib", "HF", "Lean"), class = "factor")), .Names = c("X10.5.15", 
"X10.26.15", "X11.3.15", "X11.10.15", "X11.17.15", "X11.24.15", 
"X12.3.15", "X12.10.15", "X12.17.15", "X12.22.15", "X12.29.15", 
"X1.5.16", "X1.12.16", "X1.19.16", "X1.26.16", "X2.2.16", "X2.9.16", 
"X2.16.16", "X2.23.16", "X2.29.16", "Pig", "Group"), row.names = c(NA, 
-12L), class = "data.frame")

Thanks for your help in advance!

Haley
  • If you can paste your dataset into the question using the output from the `dput` function, you will be more likely to get a successful response to your question. Having your data as an image is of no help. See http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example?rq=1 for some guidance. – Dave2e Sep 09 '16 at 18:14
  • thank you! Got that up there. – Haley Sep 09 '16 at 18:22
  • Can you further explain what "I am planning on factoring those into 3 groups for R" means? And what should be done with the `Pig` column? – Pierre L Sep 09 '16 at 18:25
  • I used `datum$Group=factor(datum$Group)`. Instead of plotting each individual separately, I used this command to group them together. "Pig" refers to my sample ID number; the data are weights from pigs I was using in my experiments. – Haley Sep 09 '16 at 18:26
  • Like I said above, the `Pig` column is simply the ID numbers of the individuals. This is why I grouped them together, so I can deal with those individuals in terms of the treatment groups they are in. They are not the same pig; they are 12 individuals. – Haley Sep 09 '16 at 18:34
  • We can post answers for you, but you should consider how you want the output to look in the end. – Pierre L Sep 09 '16 at 18:40
  • I edited the question above. Hope that helps. Thanks – Haley Sep 09 '16 at 18:44

1 Answer

library(ggplot2)
library(reshape2)

#Remove the 'X' from the dates
names(datum) <- sub("^X", "", names(datum))

We should reshape the data to long format. The idea is to have one column for each type of data.

datum_mlt <- melt(datum, id=c("Group", "Pig"), variable.name="dates")
head(datum_mlt)
#   Group Pig   dates value
# 1  Lean 102 10.5.15    56
# 2  Lean 105 10.5.15    54
# 3  Lean 108 10.5.15    61
# 4  Lean 204 10.5.15    39
# 5 AdLib 101 10.5.15    52
# 6 AdLib 104 10.5.15    66

As you can see there is a column for values, dates, ids, and treatment groups. This makes it easier to organize the information for plotting.

There are ten thousand ways to do this depending on how you want the data to look. You did not specify, so here is one example. We can clean up the axes and make everything look better if the format is correct:

p <- ggplot(datum_mlt, aes(x=dates, y=value, colour=Group, group=Pig))
p + geom_line()
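
The plot above treats `dates` as a discrete factor, so the weigh-ins are spaced evenly even though the gaps between them differ. If a true time axis is wanted, one option is to convert the labels to `Date` first. This is only a minimal sketch, assuming the labels are month.day.two-digit-year strings as in the `dput` output; the small `toy` data frame is a hypothetical stand-in for a few rows of `datum_mlt`:

```r
# Hypothetical stand-in for a few rows of datum_mlt (pig 102, values from the question)
toy <- data.frame(
  Group = c("Lean", "Lean", "Lean"),
  Pig   = c(102L, 102L, 102L),
  dates = factor(c("10.5.15", "10.26.15", "1.5.16")),
  value = c(56L, 76L, 138L)
)

# Convert the factor labels to Date; %m.%d.%y matches month.day.two-digit-year
toy$dates <- as.Date(as.character(toy$dates), format = "%m.%d.%y")

# The gaps between weigh-ins are now real numbers of days
diff(toy$dates)
```

With a `Date` column, `aes(x = dates, ...)` gives a continuous time axis, and `scale_x_date()` can be used to control the labels.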


Edit

Before grouping individuals, I would first remove the 'Pig' column. It looks like it helps, but it doesn't: it is only an ID, and leaving it in would average the ID numbers along with the weights.

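library(dplyr)
# Drop the Pig ID so it is not averaged along with the weights
datum2 <- datum[names(datum) != "Pig"]
# Mean weight per treatment group for every date column
# (plain assignment is used here because %<>% comes from magrittr, not dplyr)
datum2 <- datum2 %>% group_by(Group) %>% summarise_all(mean)
d_melt <- melt(datum2, id="Group")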

We plot the data and try to make it look a little nicer.

p <- ggplot(d_melt, aes(x=variable, y=value, colour=Group, group=Group))
p <- p + geom_line()
# Label only every third date so the axis stays readable
p <- p + scale_x_discrete(name="Date", breaks=unique(d_melt$variable)[c(TRUE, FALSE, FALSE)])
p + ggtitle("Grouped Weights Over Time") + theme_minimal()
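
For the error bars asked about in the comments, one possible approach is to keep the individual pigs in long format, compute a mean and a standard error per group and date, and draw them with `geom_errorbar()`. This is a sketch under assumptions, not part of the original answer: the `long` data frame is a hypothetical cut-down version of `datum_mlt` (two groups, two dates, values copied from the question), and the standard error of the mean is just one of several reasonable choices for the bar height.

```r
library(ggplot2)

# Hypothetical cut-down long-format data in the shape of datum_mlt
date_levels <- c("10.5.15", "10.26.15")
long <- data.frame(
  Group = rep(c("Lean", "AdLib"), each = 8),
  dates = factor(rep(rep(date_levels, each = 4), times = 2), levels = date_levels),
  value = c(56, 54, 61, 39, 76, 70, 72, 61,   # Lean
            52, 66, 48, 49, 79, 93, 72, 72)   # AdLib
)

# Mean and standard error of the mean for each group on each date
stats <- aggregate(
  value ~ Group + dates, data = long,
  FUN = function(v) c(mean = mean(v), se = sd(v) / sqrt(length(v)))
)
stats <- do.call(data.frame, stats)  # flatten into value.mean / value.se columns

# One line per treatment group, with mean +/- SE bars at each date
p_err <- ggplot(stats, aes(x = dates, y = value.mean, colour = Group, group = Group)) +
  geom_line() +
  geom_errorbar(aes(ymin = value.mean - value.se, ymax = value.mean + value.se),
                width = 0.2)
# print(p_err) to draw the plot
```

The same summary could be fed from the full `datum_mlt` by swapping it in for `long`; the individuals stay in the data and are only collapsed at the summary step.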


Pierre L
  • Thank you for the start!! If I wanted to make 3 lines for the three treatment groups `AdLib, HF, Lean`, how would I do that? Would it be better to average them in Excel and then plot? I was hoping to have error bars, etc., so keeping the individuals but grouping them together would be best. Any suggestions? – Haley Sep 09 '16 at 18:49
  • Yes you would have to decide how you want to summarize the data from the individuals into one per treatment. mean, median, etc... – Pierre L Sep 09 '16 at 18:50
  • Is there a way, other than `datum$Group = factor(datum$Group)`, to keep them individual but report them as a group? – Haley Sep 09 '16 at 18:51
  • You don't even have to do that. They will be grouped if instructed to do so in ggplot – Pierre L Sep 09 '16 at 18:52
  • If you notice I did not factor them. Also, R does that automatically :) But for argument's sake, even if they were character strings, it wouldn't be a problem – Pierre L Sep 09 '16 at 18:53
  • what is the code i need to add to this to group them in ggplot? – Haley Sep 09 '16 at 18:54
  • thank you, again, for all your help! I really appreciate it! I'm very new to all this, i'm sure you can tell – Haley Sep 09 '16 at 18:55
  • Np, I am writing a summary script now – Pierre L Sep 09 '16 at 18:55
  • You should remove the 'Pig' column before plotting. I know it looks like it helps, but it doesn't. – Pierre L Sep 09 '16 at 18:56
  • I added an update with cleaner output and grouped treatments. But the error bars, I don't get that part. – Pierre L Sep 09 '16 at 19:07
  • Thank you! If need be, I can probably add error bars in using another software program. Thank you so much for your help, i really appreciate it! – Haley Sep 09 '16 at 19:16
  • No problem. If you feel the answer helps, feel free to check the answer as accepted. This lets other users know that the question has been answered. – Pierre L Sep 09 '16 at 19:19
  • I ran the first line and got this error: `Warning message: package ‘dplyr’ was built under R version 3.2.5 > datum2 %<>% group_by(Group) %>% summarise_all(mean) > d_melt <- melt(datum2, id="Group") Error: could not find function "melt"` – Haley Sep 09 '16 at 19:21
  • You need `library(reshape2)`. Did you see it at the top of my answer? – Pierre L Sep 09 '16 at 19:22
  • Missed that. Thank you! – Haley Sep 09 '16 at 19:29
  • If I wanted to change the color of the lines, how would I specify that? Would it need to be wrapped in `geom_line()`? – Haley Sep 09 '16 at 20:45
  • You would use another function called `scale_colour_discrete`. Do a search for it and report back. – Pierre L Sep 09 '16 at 20:48
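
As a follow-up to the colour question, here is a minimal sketch of one way to set the line colours by hand. It uses `scale_colour_manual()` rather than the `scale_colour_discrete()` mentioned above, since it takes an explicit colour per group. The colour names are arbitrary choices, and `demo` is a hypothetical stand-in for `d_melt` holding the group means for the first two dates in the question's data.

```r
library(ggplot2)

# Arbitrary colour choices, one per treatment group (names must match the Group values)
group_colours <- c(AdLib = "darkorange", HF = "firebrick", Lean = "steelblue")

# Hypothetical stand-in for d_melt: group means for the first two weigh-ins
date_levels <- c("10.5.15", "10.26.15")
demo <- data.frame(
  Group    = rep(c("AdLib", "HF", "Lean"), times = 2),
  variable = factor(rep(date_levels, each = 3), levels = date_levels),
  value    = c(53.75, 52.5, 52.5, 79, 72.5, 69.75)
)

# The colour scale is added to the plot, not wrapped inside geom_line()
p_col <- ggplot(demo, aes(x = variable, y = value, colour = Group, group = Group)) +
  geom_line() +
  scale_colour_manual(values = group_colours)
# print(p_col) to draw the plot
```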