Sorry if this has an obvious answer, but I'm a little unsure how to approach it.

Say I have a dataset containing people's names, the dates on which they made sales, and the number of sales they made on each date, in the following format:

Name    |    Date    |     Sales
------------------------------------
AAA     | 01/01/2001 |     50
AAA     | 01/02/2001 |     62
AAA     | 01/03/2001 |     73
...     |    ...     |     ...
AAA     | 05/15/2001 |     20
BBB     | 02/06/2001 |     51
BBB     | 02/09/2001 |     45
...     |    ...     |     ...
BBB     | 04/13/2001 |     3
CCC     | 01/22/2001 |     78
...     |    ...     |     ...
...     |    ...     |     ...

My data looks roughly like the table above. There are several different names, and the dates for each name do not line up: one person may start much earlier in the year than another and so has sales data from much earlier. The dates also skip forward in places; for example, a row dated 4/3/2001 may be followed directly by one dated 4/25/2001.
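For reference, the visible rows of the table could be rebuilt as a data frame roughly as follows. This is only a sketch: the elided "..." rows are left out, the object names `sales` and `gaps` are made up here, and the month/day/year date format is an assumption based on entries such as 05/15/2001.

```r
# Only the rows visible in the table above; the elided "..." rows are omitted.
sales <- data.frame(
  Name  = c("AAA", "AAA", "AAA", "AAA", "BBB", "BBB", "BBB", "CCC"),
  Date  = c("01/01/2001", "01/02/2001", "01/03/2001", "05/15/2001",
            "02/06/2001", "02/09/2001", "04/13/2001", "01/22/2001"),
  Sales = c(50, 62, 73, 20, 51, 45, 3, 78),
  stringsAsFactors = FALSE
)

# Assumption: dates are month/day/year.
sales$Date <- as.Date(sales$Date, format = "%m/%d/%Y")

# Days between consecutive sales for each person; values above 1 are gaps.
gaps <- lapply(split(sales$Date, sales$Name),
               function(d) as.numeric(diff(d)))
str(sales)
```

Parsing the Date column into a real `Date` up front matters because plotting functions then place the points at their true positions on the time axis, so skipped dates appear as wider spacing rather than being drawn as evenly spaced.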

What I would like to do is plot the data for the whole year, with every person (AAA, BBB, CCC, ...) and all of their sales against the dates on which they were made, in one big plot.

I can think of one way to do this: use subset() to split the dataset by name and then plot each subset. However, that seems inefficient, and I'm sure R has better ways to plot time series data, even when the data is a little irregular. If anyone has suggestions or could provide some help, I'd appreciate it. Thanks in advance.
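To make the subset() idea concrete, it might look roughly like the base R sketch below. The data frame `sales` and its values are hypothetical stand-ins in the same long format, not the real dataset.

```r
# Hypothetical stand-in data in the same long format (values are made up).
sales <- data.frame(
  Name  = rep(c("AAA", "BBB"), each = 3),
  Date  = as.Date(c("2001-01-01", "2001-01-02", "2001-05-15",
                    "2001-02-06", "2001-02-09", "2001-04-13")),
  Sales = c(50, 62, 20, 51, 45, 3),
  stringsAsFactors = FALSE
)

people <- as.character(unique(sales$Name))
colors <- seq_along(people)

# Empty plot whose axes span every date and sales value in the data
plot(sales$Date, sales$Sales, type = "n", xlab = "Date", ylab = "Sales")

# Draw one line per person from that person's subset
for (i in seq_along(people)) {
  one <- subset(sales, Name == people[i])
  lines(one$Date, one$Sales, col = colors[i])
}
legend("topright", legend = people, col = colors, lty = 1)
```

This works, but the loop, the colors, and the legend all have to be managed by hand, which is the inefficiency described above.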

shiny
ThePlowKing
  • In the future, could you please provide a reproducible example? See http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – shiny Mar 27 '17 at 03:00
  • Extremely sorry, your comment was exactly what I was looking for but I wanted to properly respond to it before accepting it :) – ThePlowKing Mar 27 '17 at 07:32
  • No worries. I just wanted to help if my answer is not what you were looking for. – shiny Mar 27 '17 at 08:32

1 Answer

Are you looking for something like this?

library(dplyr)
library(tidyr)
library(ggplot2)
# Create an example data.frame: three people whose daily series start and end on different dates
set.seed(1)  # make the random sales values reproducible
Date <- c(seq(as.Date("2001-01-03"), as.Date("2001-10-17"), by = 1),
          seq(as.Date("2001-05-10"), as.Date("2001-12-17"), by = 1),
          seq(as.Date("2001-04-12"), as.Date("2001-11-17"), by = 1))
Name <- c(rep("AAA", 288), rep("BBB", 222), rep("CCC", 220))
Sales <- c(sample(10:20, 288, replace = TRUE),
           sample(50:60, 222, replace = TRUE),
           sample(80:90, 220, replace = TRUE))
df <- data.frame(Name, Date, Sales)

# Keep only some rows (dates) to create irregular time series with missing dates,
# then spread to wide format: one column per Name
df1 <- df[c(1:50, 100:150, 190:288, 289:370, 400:450, 480:510, 511:640, 670:730), ] %>%
  tidyr::spread(Name, Sales)

# Create a data.frame (df_whole_yr) with one row per day of 2001,
# join the irregular series onto it (dates without a sale become NA),
# and convert it back to long format
df_whole_yr <- data.frame(Date = seq(as.Date("2001-01-01"), as.Date("2001-12-31"), by = 1)) %>%
  dplyr::left_join(df1, by = "Date") %>%
  tidyr::gather("Name", "Sales", 2:4)

# Plot one line per person; the NA values show up as gaps in the lines.
# linewidth needs ggplot2 >= 3.4.0; on older versions use size = 0.2 instead.
ggplot(df_whole_yr, aes(x = Date, y = Sales, color = Name)) +
  geom_line(linewidth = 0.2)

[Figure: line plot of Sales against Date for 2001, one colored line per Name]
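As a possible shortcut, tidyr's complete() can fill in the missing dates per person in one step, without the spread, join, and gather round trip. The sketch below uses a small hypothetical data frame (`sales`, with made-up values and a one-week date range) rather than the example data above.

```r
library(tidyr)
library(ggplot2)

# Hypothetical irregular data (values are made up for illustration).
sales <- data.frame(
  Name  = c("AAA", "AAA", "AAA", "BBB", "BBB"),
  Date  = as.Date(c("2001-01-01", "2001-01-02", "2001-01-05",
                    "2001-01-03", "2001-01-04")),
  Sales = c(50, 62, 73, 51, 45),
  stringsAsFactors = FALSE
)

# One row per Name and calendar day; days without a sale get Sales = NA.
all_days <- seq(as.Date("2001-01-01"), as.Date("2001-01-07"), by = "day")
sales_full <- complete(sales, Name, Date = all_days)

# NA values break the lines, so missing dates appear as gaps.
p <- ggplot(sales_full, aes(x = Date, y = Sales, color = Name)) +
  geom_line()
print(p)
```

If the gaps should be bridged instead, plotting the original long data frame directly with the same aes() mapping connects each person's consecutive observations with a straight line.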

shiny
  • This assumes that the three time series are regular though... Care to take into account the irregularity of OP's series? – acylam Mar 27 '17 at 02:11
  • @useR Thanks. I have updated the answer. Please let me know if you think it still doesn't take into account the irregularity of the OP's series. It would have been much easier if the OP had provided a data.frame. – shiny Mar 27 '17 at 02:45
  • Thanks, this method is exactly what I was looking for. Thanks also for commenting on each line; that's quite helpful, since I've never used those packages before and would otherwise have no idea what each line means. – ThePlowKing Mar 27 '17 at 07:34