I want to make simple line plots similar to the ones on the tutorial for ggplot:

p1 <- ggplot(ChickWeight, aes(x=Time, y=weight, colour=Diet, group=Chick)) +
    geom_line() +
    ggtitle("Growth curve for individual chicks")

[From cookbook-r.com]

However, the examples given have the data organized efficiently (one column has the x variable, another column has the y variable).

If I have data that is not so neat (in my data, each row represents a changing observation of data), can I still use ggplot? Do I have to rearrange the data in the initial file to use ggplot?

For example, if my data reads:

Names       1991  1992  1993
Johny         40    50    80
Dana          78    70    90

How could I create a line plot for Johny's progress? Dana's?

alistaire
Haim
    Possible duplicate of [Plotting two variables as lines using ggplot2 on the same graph](http://stackoverflow.com/questions/3777174/plotting-two-variables-as-lines-using-ggplot2-on-the-same-graph) – shrgm May 17 '16 at 18:29

3 Answers

You need to reshape your data to long form before you can plot. Using dplyr and tidyr:

library(dplyr)
library(tidyr)
library(ggplot2)

df_clean <- df %>% 
    gather(year, value, num_range('X', 1991:1993)) %>% 
    mutate(year = extract_numeric(year))

df_clean
#   Names year value
# 1 Johny 1991    40
# 2  Dana 1991    78
# 3 Johny 1992    50
# 4  Dana 1992    70
# 5 Johny 1993    80
# 6  Dana 1993    90

ggplot(df_clean, aes(x = year, y = value, colour = Names)) + geom_line()

[Figure: ggplot with two lines]

Note that the plot still needs a little polishing: the default x-axis breaks may fall on fractional years, which adding scale_x_continuous(breaks = 1991:1993) would fix.
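For readers on a newer tidyr, here is a minimal sketch of the same reshape using pivot_longer(), which the tidyr documentation recommends over gather(). It assumes tidyr 1.0.0 or later and the X1991 to X1993 column names from this answer's data; df_long is just an illustrative name.

```r
library(tidyr)
library(ggplot2)

# Same wide data as in this answer (years carry the "X" prefix)
df <- data.frame(
  Names = c("Johny", "Dana"),
  X1991 = c(40, 78),
  X1992 = c(50, 70),
  X1993 = c(80, 90)
)

# Reshape every column except Names into year/value pairs,
# dropping the "X" prefix from the former column names
df_long <- pivot_longer(
  df,
  cols = -Names,
  names_to = "year",
  names_prefix = "X",
  values_to = "value"
)

# The year is still character after pivoting; make it numeric for a continuous axis
df_long$year <- as.integer(df_long$year)

p <- ggplot(df_long, aes(x = year, y = value, colour = Names)) +
  geom_line() +
  scale_x_continuous(breaks = 1991:1993)
p
```

Converting the year after the pivot, rather than inside it, keeps the sketch independent of arguments that only exist in later tidyr versions.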


Data:

df <- structure(list(Names = structure(c(2L, 1L), .Label = c("Dana", 
        "Johny"), class = "factor"), X1991 = c(40L, 78L), X1992 = c(50L, 
        70L), X1993 = c(80L, 90L)), .Names = c("Names", "X1991", "X1992", 
        "X1993"), class = "data.frame", row.names = c(NA, -2L))
alistaire

You could also use the reshape function as follows:

library(ggplot2)

# Wide example data; the year columns are named "1991", "1992", "1993"
df <- data.frame(c("Johny", "Dana"), c(40, 78), c(50, 70), c(80, 90))
names(df) <- c("Names", 1991, 1992, 1993)
df
#   Names 1991 1992 1993
# 1 Johny   40   50   80
# 2  Dana   78   70   90

# Reshape to long form: one row per (Names, time) pair
new.df <- reshape(data = df, direction = "long", idvar = "Names",
                  varying = list(2:4), v.names = "Value", times = 1991:1993)

p1 <- ggplot(new.df, aes(x = time, y = Value, colour = Names)) +
    geom_line() +
    scale_x_continuous(breaks = 1991:1993)
p1


You already have a couple of solutions here, and the reshape option achieves your goal very cleanly. Here is one more way to reformat your data without any additional packages, this one relying on the stack function.

# Load ggplot2
library(ggplot2)

# Create example data
df <- data.frame(c("Johny", "Dana"), c(40, 78), c(50, 70), c(80, 90))
names(df) <- c("names", 1991, 1992, 1993)

# Create long data
df.long <- data.frame(names=rep(df$names, 3), stack(df[,2:4]))
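# stack() returns ind as a factor; convert via character so the years
# are kept (as.numeric() on the factor alone would give 1, 2, 3)
df.long$ind <- as.numeric(as.character(df.long$ind))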

# Plot
ggplot(df.long) + geom_line(aes(ind, values, colour=names))
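The question asks about a plot for Johny alone and one for Dana. Once the data is in long form, one way to get that is to subset to a single person or to draw one panel per person. This is only a sketch built on the stack approach above; the names johny, p.johny and p.facets are illustrative.

```r
library(ggplot2)

# Wide example data, as in the question
df <- data.frame(c("Johny", "Dana"), c(40, 78), c(50, 70), c(80, 90))
names(df) <- c("names", 1991, 1992, 1993)

# Long form via stack(); ind is a factor, so convert through character
df.long <- data.frame(names = rep(df$names, 3), stack(df[, 2:4]))
df.long$ind <- as.numeric(as.character(df.long$ind))

# Option 1: keep only one person's rows and plot a single line
johny <- subset(df.long, names == "Johny")
p.johny <- ggplot(johny, aes(ind, values)) +
  geom_line() +
  scale_x_continuous(breaks = 1991:1993)

# Option 2: one panel per person in a single figure
p.facets <- ggplot(df.long, aes(ind, values)) +
  geom_line() +
  facet_wrap(~ names) +
  scale_x_continuous(breaks = 1991:1993)

p.johny
p.facets
```

Subsetting suits a one-off plot for one person, while facet_wrap() scales better when there are many names to compare.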

SlowLoris