I have some experience with base R but am trying to learn tidyverse and ggplot. I have a dataframe with 4 columns of data. I want a simple x-y plot, where the first column of data is on the x-axis, and the data in the other 3 columns is plotted on the y-axis, resulting in 3 lines on one plot. The first 15 lines of my data look like this (sorry about the image - I don't know how to insert a sample of my data):

screen shot - first 15 rows of data

I tried to plot the second and third columns of data as follows:

ggplot(data=SWRC_SL, aes(x=SWRC_SL$pressure_head, y=SWRC_SL$UNSODA_theta)) + 
geom_line(colour="red") + scale_x_log10() +
ggplot(data=SWRC_SL, aes(x=SWRC_SL$pressure_head, y=SWRC_SL$Vrugt_theta)) + 
geom_line(colour="blue") + scale_x_log10()

I get this error:

Error: Don't know how to add ggplot(data = SWRC_SL, aes(x = SWRC_SL$pressure_head, y = SWRC_SL$Vrugt_theta)) to a plot

I believe I should be using something like "group=" to indicate which columns should be plotted, but I haven't been able to find an example that shows how to use ggplot to plot data across multiple columns. What am I missing?

jazzurro
  • To provide a good reproducible example, you can check this post: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – dc37 Jan 22 '20 at 05:32

2 Answers

ggplot() is only ever called once when you create a chart. Try with the following:

ggplot() + 
  geom_line(data=SWRC_SL, aes(x=pressure_head, y=UNSODA_theta), colour="red") + 
  geom_line(data=SWRC_SL, aes(x=pressure_head, y=Vrugt_theta), colour="blue") + 
  scale_x_log10()
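
This works because layers inherit whatever data and aesthetics are set in the single ggplot() call, so the shared parts can be written once. Note also that aes() takes bare column names rather than SWRC_SL$column. A minimal sketch, assuming a made-up stand-in for SWRC_SL (the values below are invented for illustration only):

```r
library(ggplot2)

# Hypothetical stand-in for SWRC_SL; these numbers are made up
SWRC_SL <- data.frame(
  pressure_head = c(1, 10, 100, 1000),
  UNSODA_theta  = c(0.45, 0.40, 0.30, 0.20),
  Vrugt_theta   = c(0.43, 0.37, 0.28, 0.18)
)

# One ggplot() call sets the data and the shared x aesthetic;
# each geom_line() layer only supplies its own y column
p <- ggplot(SWRC_SL, aes(x = pressure_head)) +
  geom_line(aes(y = UNSODA_theta), colour = "red") +
  geom_line(aes(y = Vrugt_theta), colour = "blue") +
  scale_x_log10() +
  labs(y = "theta")

print(p)
```

The colours here are fixed per layer, so no legend is drawn.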

A better method would be to reshape your data into long format, where the UNSODA_theta and Vrugt_theta values are in the same column (say thetas), and another column (say type_theta) indicates whether each value comes from UNSODA_theta or Vrugt_theta. Then you could do the following:

ggplot(data=SWRC_SL, aes(x=pressure_head, y=thetas, colour=type_theta)) + 
  geom_line() + 
  scale_x_log10()

This is more desirable because ggplot2 will include a legend indicating what type of theta the colours are applied to.
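
One way to build the thetas and type_theta columns is tidyr's pivot_longer(). The sketch below uses a made-up stand-in for SWRC_SL (invented values), and the name SWRC_long is only illustrative:

```r
library(tidyr)
library(ggplot2)

# Hypothetical stand-in for SWRC_SL; these numbers are made up
SWRC_SL <- data.frame(
  pressure_head = c(1, 10, 100, 1000),
  UNSODA_theta  = c(0.45, 0.40, 0.30, 0.20),
  Vrugt_theta   = c(0.43, 0.37, 0.28, 0.18)
)

# Stack the two theta columns: the old column names go into type_theta,
# the values go into thetas
SWRC_long <- pivot_longer(
  SWRC_SL,
  cols      = c(UNSODA_theta, Vrugt_theta),
  names_to  = "type_theta",
  values_to = "thetas"
)

# Mapping colour to type_theta gives one line per source column, plus a legend
p <- ggplot(SWRC_long, aes(x = pressure_head, y = thetas, colour = type_theta)) +
  geom_line() +
  scale_x_log10()

print(p)
```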

Phil
  • Converting the data to long is the best way to go, it's what ggplot is designed to expect and it makes a lot of ggplot's features work automatically. – Marius Jan 22 '20 at 05:14
  • Thanks for all the comments. I used the code provided by dc37 and the results are great. – cookie monster Jan 23 '20 at 04:09

As suggested by @Marius, the most efficient way to plot your data is to convert them into a long format.

Using tidyverse, you can use the pivot_longer function (from the tidyr package) and write the following code:

library(tidyverse)
SWRC_SL %>% pivot_longer(.,-pressure_head, names_to = "variable", values_to = "value") %>%
  ggplot(aes(x = pressure_head, y = value, color = variable))+
  geom_line()+
  scale_x_log10()
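
The code above pivots every column except pressure_head. If you only want some of the columns, as in the question's attempt with the second and third columns, one option is to select them before pivoting. A sketch, assuming made-up data and assuming the fourth column is called Cassel_theta:

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical stand-in for SWRC_SL; values and the Cassel_theta name are assumptions
SWRC_SL <- data.frame(
  pressure_head = c(1, 10, 100, 1000),
  UNSODA_theta  = c(0.45, 0.40, 0.30, 0.20),
  Vrugt_theta   = c(0.43, 0.37, 0.28, 0.18),
  Cassel_theta  = c(0.44, 0.39, 0.27, 0.19)
)

# Keep only the x column and the two y columns of interest, then pivot
two_models <- SWRC_SL %>%
  select(pressure_head, UNSODA_theta, Vrugt_theta) %>%
  pivot_longer(-pressure_head, names_to = "variable", values_to = "value")

p <- ggplot(two_models, aes(x = pressure_head, y = value, color = variable)) +
  geom_line() +
  scale_x_log10()

print(p)
```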

EDIT: Illustrating example

Using this dummy dataset:

  pressure UNSODA_theta Vrugt_theta Cassel_theta
1        0   -1.4672500   1.4119747   -2.0553118
2        1    0.5210227   0.6189239    1.4817574
3        2   -0.1587546   1.4094018    2.2796175
4        3    1.4645873   2.6888733   -0.4631109
5        4   -0.7660820   2.5865884   -1.8799346
6        5   -0.4302118   0.6690922    0.9633620

First, you pivot your data into a long format:

df %>% pivot_longer(.,-pressure, names_to = "variable", values_to = "value")

# A tibble: 45 x 3
   pressure variable      value
      <int> <chr>         <dbl>
 1        0 UNSODA_theta -1.47 
 2        0 Vrugt_theta   1.41 
 3        0 Cassel_theta -2.06 
 4        1 UNSODA_theta  0.521
 5        1 Vrugt_theta   0.619
 6        1 Cassel_theta  1.48 
 7        2 UNSODA_theta -0.159
 8        2 Vrugt_theta   1.41 
 9        2 Cassel_theta  2.28 
10        3 UNSODA_theta  1.46 
# … with 35 more rows

Now your data are suitable for plotting with ggplot2. You can chain the ggplot command directly onto the previous command by adding a "pipe" (%>%) between them:

library(tidyverse)
df %>% pivot_longer(.,-pressure, names_to = "variable", values_to = "value") %>%
  ggplot(aes(x = pressure, y = value, color = variable))+
  geom_line()+
  scale_x_log10()

And you get the plot with a legend included.
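
If you would rather keep specific colours, such as the red and blue from the question, than use the defaults, scale_color_manual() accepts a named vector that maps each value of the colour variable to a colour. A sketch with made-up data; the pressures start at 1 here because log10(0) is -Inf, and "darkgreen" is an arbitrary choice:

```r
library(tidyr)
library(ggplot2)

# Hypothetical data in the same shape as the dummy dataset; values are made up
df <- data.frame(
  pressure     = 1:5,
  UNSODA_theta = c(0.45, 0.41, 0.36, 0.30, 0.25),
  Vrugt_theta  = c(0.43, 0.38, 0.33, 0.28, 0.22),
  Cassel_theta = c(0.44, 0.40, 0.34, 0.29, 0.24)
)

long_df <- pivot_longer(df, -pressure, names_to = "variable", values_to = "value")

# The names of the vector must match the values in the 'variable' column
p <- ggplot(long_df, aes(x = pressure, y = value, color = variable)) +
  geom_line() +
  scale_x_log10() +
  scale_color_manual(values = c(
    UNSODA_theta = "red",
    Vrugt_theta  = "blue",
    Cassel_theta = "darkgreen"
  ))

print(p)
```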

Data example

df <- structure(list(pressure = 0:14, UNSODA_theta = c(-1.46725002909224, 
0.521022742648139, -0.158754604716016, 1.4645873119698, -0.766081999604665, 
-0.430211753928547, -0.926109497377437, -0.17710396143654, 0.402011779486338, 
-0.731748173119606, 0.830373167981674, -1.20808278630446, -1.04798441280774, 
1.44115770684428, -1.01584746530465), Vrugt_theta = c(1.41197471231751, 
0.61892394889108, 1.40940183965093, 2.68887328620405, 2.58658843344197, 
0.669092199317234, -1.28523553529247, 3.49766158983416, 1.66706616676549, 
1.5413273359637, 0.986600476854091, 1.51010842295293, 0.835624168230333, 
1.42069464325451, 0.599753256022356), Cassel_theta = c(-2.05531181632119, 
1.48175740118232, 2.27961753824932, -0.46311085383842, -1.87993463341154, 
0.963361958516736, -0.0670637053409687, -2.59982761023726, 0.00319778952040447, 
-0.945450500892219, -0.511452869790608, -1.73485854395378, 2.7047128618762, 
-0.496698054586832, -2.40827011837962)), class = "data.frame", row.names = c(NA, 
-15L))
dc37