I have a text file containing data like this:

        A          C         G  class            phylum           order
-0.000187  -0.219166  1.693306  Chordata         Monotremata      Mammalia
 0.015664  -0.264506  1.482692  Chordata         Batidoidimorpha  Chondrichthyes
-0.404323   0.219374  2.230190  Platyhelminthes  Cyclophyllidea   Cestoda

but of course it has many more rows. I want to plot this data so that all the classes are on the x-axis, each class has its A, C and G values plotted with geom_point, and the points are connected by a line whose color depends on whether it is A, C or G. I managed to do this with the plot and par functions, but now I want to do it with the ggplot2 library.

Your question is not clear. What are the lines supposed to connect? Are there supposed to be different lines for each column: A, C, & G? It may help if you post the code you currently have so people can translate. Also, try posting some sample data that others can use in their R session with something like `dput(head(x,10))`. How do `phylum` and `order` fit into the graph, if at all? – Chase Mar 24 '11 at 00:43

2 Answers

The specifics of your question are a bit unclear, but the general approach to plotting multiple variables in one plot with ggplot2 is to melt() the data.frame into long format first. I didn't follow how the points and lines are supposed to fit into your graph, but here's an approach that uses the colour aesthetic to plot the columns A, C, and G by class on the x-axis:

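library(ggplot2)
library(reshape2)

# Example data: three numeric columns and one grouping column
df <- data.frame(a = rnorm(10), c = rnorm(10), g = rnorm(10),
                 class = sample(letters[20:23], 10, TRUE))
# Reshape to long format: one row per (class, variable, value)
df.m <- melt(df, id.vars = "class")
ggplot(df.m, aes(class, value, colour = variable)) +
  geom_point()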
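
As a rough sketch of how the same approach might be applied to the sample rows from the question, the block below reads them from inline text and melts them. The object names `txt`, `dat`, `dat.m` and `p` are made up for illustration, and the column headers are used exactly as they appear in the question.

```r
library(ggplot2)
library(reshape2)

# The three sample rows from the question, read from inline text
# (txt, dat, dat.m and p are hypothetical names, not from the original post)
txt <- "A C G class phylum order
-0.000187 -0.219166 1.693306 Chordata Monotremata Mammalia
0.015664 -0.264506 1.482692 Chordata Batidoidimorpha Chondrichthyes
-0.404323 0.219374 2.230190 Platyhelminthes Cyclophyllidea Cestoda"
dat <- read.table(text = txt, header = TRUE, stringsAsFactors = TRUE)

# Keep the three label columns as identifiers; A, C and G are stacked
# into a 'variable' column with their numbers in a 'value' column
dat.m <- melt(dat, id.vars = c("class", "phylum", "order"))

# One point per measurement, coloured by A, C or G
p <- ggplot(dat.m, aes(class, value, colour = variable)) +
  geom_point()
p
```

Naming the id columns explicitly keeps melt() from guessing which columns are labels and which are measurements.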

I had a similar plotting problem.

The answer is that you need a new column that can be used in the group interaction. Here, I created a column called V1 that records which letter each row belongs to, then used aes(group=interaction(variable.factor, new.factor)). In this example, the column 'V1' is arbitrary.

  class variable       value V1
1      u        a  0.77041380  a
2      v        a  0.09461429  a
3      t        a  0.22704242  a
4      w        a -0.21501380  a
5      w        a -0.48246983  a
6      v        a  1.69609897  a
7      w        a -0.38847860  a
8      t        a  2.45669883  a
9      t        a  0.24774451  a
10     u        a  0.04195110  a
11     u        c  0.57444553  c
12     v        c  0.73172047  c
13     t        c -1.59409421  c
14     w        c -0.12679464  c
15     w        c  0.19424856  c
16     v        c -1.28742724  c
17     w        c -1.12103626  c
18     t        c -0.57090558  c
19     t        c  0.53798077  c
20     u        c -0.47777022  c
21     u        g -0.91249913  g
22     v        g -1.49256508  g
23     t        g -1.77449710  g
24     w        g  0.71426647  g
25     w        g  0.79678361  g
26     v        g -1.28814106  g
27     w        g -1.04701972  g
28     t        g  0.07309817  g
29     t        g  2.03606615  g
30     u        g  1.76030312  g
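
The answer does not show how V1 was built. One possible construction that gives a table of the same shape, assuming the `df` and `df.m` objects from the answer above, is to copy the melted `variable` column; the seed is an arbitrary addition so that the example is repeatable.

```r
library(ggplot2)
library(reshape2)

set.seed(1)  # arbitrary seed, not part of the original answers
df <- data.frame(a = rnorm(10), c = rnorm(10), g = rnorm(10),
                 class = sample(letters[20:23], 10, TRUE))
df.m <- melt(df, id.vars = "class")

# Assumed construction of V1: a copy of the melted 'variable' column,
# so each row is labelled with the letter (a, c or g) it came from
df.m$V1 <- df.m$variable
head(df.m)
```

Because V1 built this way mirrors `variable`, interaction(V1, variable) separates the rows into the same three groups as `variable` alone, so group = variable should give the same lines.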

Slightly modifying the ggplot code above:

ggplot(df.m, aes(class, value, colour = variable, group = interaction(V1, variable))) +
  geom_point() +
  geom_line()

This results in one line per letter, connecting its points across the classes (using the data from the above answer).

[Figure: letters connected by line]

If you want to separate the lines, use position=position_dodge():

ggplot(df.m, aes(class, value, colour = variable, group = interaction(V1, variable))) +
  geom_point(position = position_dodge(width = 0.2)) +
  geom_line(position = position_dodge(width = 0.2))

[Figure: letters connected by line with jitter]

The takeaway here is that group=interaction() needs a factor that is separate from your x-axis variable and spans across it.
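
To see why the grouping matters, the following sketch compares how many line groups ggplot2 builds with and without an explicit group aesthetic. It rebuilds the example data from the first answer with an arbitrary seed; `p_default`, `p_grouped`, `n_default` and `n_grouped` are illustrative names. With a discrete x-axis and no group aesthetic, ggplot2 groups by the combination of all discrete variables in the plot, which here includes class, so no line can cross from one class to the next.

```r
library(ggplot2)
library(reshape2)

set.seed(1)  # arbitrary seed, not part of the original answers
df <- data.frame(a = rnorm(10), c = rnorm(10), g = rnorm(10),
                 class = sample(letters[20:23], 10, TRUE))
df.m <- melt(df, id.vars = "class")

# No explicit group: groups come from every discrete variable (class x variable)
p_default <- ggplot(df.m, aes(class, value, colour = variable)) +
  geom_line()

# Explicit group that does not involve the x variable: one line per letter
p_grouped <- ggplot(df.m, aes(class, value, colour = variable,
                              group = variable)) +
  geom_line()

# layer_data() returns the data ggplot2 computed for a layer,
# including the group id assigned to each row
n_default <- length(unique(layer_data(p_default)$group))
n_grouped <- length(unique(layer_data(p_grouped)$group))
c(default = n_default, grouped = n_grouped)
```

The grouped plot ends up with one group per letter, while the default plot splits each letter into a separate group for every class.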
