
My data looks something like this. There are 10,000 rows, one per city, with a column for every month from 1998-01 to 2013-09:

RegionName   | State | Metro        | CountyName   | 1998-01 | 1998-02 | 1998-03
New York     | NY    | New York     | Queens       | 1.3414  | 1.344   | 1.3514
Los Angeles  | CA    | Los Angeles  | Los Angeles  | 12.8841 | 12.5466 | 12.2737
Philadelphia | PA    | Philadelphia | Philadelphia | 1.626   | 0.5639  | 0.2414
Phoenix      | AZ    | Phoenix      | Maricopa     | 2.7046  | 2.5525  | 2.3472

I want to plot all months since 1998 for any single city, or for several cities at once.

I tried this but I get an error. I am not sure I am even approaching this correctly. Any help would be appreciated. Thank you.

forecl <- ts(forecl, start=c(1998, 1), end=c(2013, 9), frequency=12)

plot(forecl)

Error in plotts(x = x, y = y, plot.type = plot.type, xy.labels = xy.labels,  : 
  cannot plot more than 10 series as "multiple"
user2892196
  • Why start in 1998 if the data starts in 2005? You should post `dput(head(forecl))`. You should also specify the plot design you want: all years and months in sequence, or all Jan-Dec stacked? It will probably work better if you reshape to long format. – IRTFM Oct 31 '13 at 22:36

3 Answers


You might try

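library(reshape)
library(ggplot2)
# Keep the identifier columns and stack the month columns into month/value pairs.
forecl <- melt(forecl, id.vars = c("RegionName", "State", "Metro", "CountyName"), variable_name = "month")
# Month labels such as "1998-01" (or "X1998.01" after read.csv) carry no day, so keep the digits and append day 01.
forecl$month <- as.Date(paste0(gsub("\\D", "", forecl$month), "01"), format = "%Y%m%d")
ggplot(forecl, aes(x = month, y = value, color = RegionName)) + geom_line()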
colcarroll
  • Thank you for your response. I tried forecl$month <- as.Date(forecl$month); forecl$month <- as.Date(forecl$month, "%d.%b.%Y"); forecl$month <- as.Date(forecl$month[,1], "%d.%b.%Y") and I get the same error: Error in as.Date.default(forecl$month[, 1], "%d.%b.%Y") : do not know how to convert 'forecl$month[, 1]' to class “Date”. I am trying to work out what is wrong with my code. – user2892196 Nov 01 '13 at 00:07
  • Ah, sorry, didn't notice your date was in %Y-%m format. Try following [this example](http://stackoverflow.com/questions/6242955/converting-year-and-month-to-a-date-in-r) and use forecl$month <- as.Date(paste(forecl$month,"-01",sep=""), "%Y-%m-%d"), or using the zoo package. – colcarroll Nov 01 '13 at 05:06

To add to @JLLagrange's answer, you might want to pass the city column to facet_grid() if there are too many cities for the colors to be distinguishable.

ggplot(forecl, aes(x = month, y = value, color = RegionName, group = RegionName)) +
  geom_line() +
  facet_grid( ~ RegionName)
maloneypatr
  • Good suggestion. With 10,000 rows, it might even make sense to use a 2-D grid, like facet_grid(State ~ Metro). – colcarroll Nov 01 '13 at 05:10

Could you provide an example of your data, e.g. dput(head(forecl)), before converting to a time-series object? The problem might also be with the ts object.

In any case, I think there are two problems.

First, the data are in wide format. I'm not sure about your column names, since they should start with a letter, but in any case the general idea would be to do something like this:

test <- structure(list(
  city = structure(1:2, .Label = c("New York", "Philly"), 
  class = "factor"), state = structure(1:2, .Label = c("NY", 
  "PA"), class = "factor"), a2005.1 = c(1, 1), a2005.2 = c(2, 5
  )), .Names = c("city", "state", "a2005.1", "a2005.2"), row.names = c(NA, 
  -2L), class = "data.frame")

test.long <- reshape(test, varying=c(3:4), direction="long")
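With month columns named like those in the question ("1998-01", "1998-02", ...), reshape() cannot guess the split on its own, so the value and time columns have to be named explicitly. The following is only a sketch on toy data: `wide`, `month_cols` and `long` are made-up names, the values are copied from the sample rows in the question, and it assumes the month columns really are called "YYYY-MM" (if read.csv turned them into something like "X1998.01", the pattern needs adjusting).

```r
# Toy data in the question's layout; check.names = FALSE keeps "1998-01" as a column name.
wide <- data.frame(
  RegionName = c("New York", "Phoenix"),
  State      = c("NY", "AZ"),
  `1998-01`  = c(1.3414, 2.7046),
  `1998-02`  = c(1.344, 2.5525),
  `1998-03`  = c(1.3514, 2.3472),
  check.names = FALSE
)

# Every column whose name looks like YYYY-MM is treated as a month.
month_cols <- grep("^[0-9]{4}-[0-9]{2}$", names(wide), value = TRUE)

# One row per city and month; the old column names become the 'month' values.
long <- reshape(
  wide,
  direction = "long",
  varying   = month_cols,
  v.names   = "value",
  timevar   = "month",
  times     = month_cols,
  idvar     = "RegionName"
)

# "1998-01" has no day, so append one before converting to Date.
long$month <- as.Date(paste0(long$month, "-01"))
long <- long[order(long$RegionName, long$month), ]
rownames(long) <- NULL
```

The resulting `long` has one `month` and one `value` column, which is the shape that ggplot2 or a per-city `plot()` call expects.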

Second, I think you are trying to plot too many cities at the same time. Try:

plot(forecl[, 1])

or

plot(forecl[, 1:5])
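If you want to stay with ts objects, note that ts() expects time to run down the rows and each series to be a column, which is the transpose of the layout in the question. Presumably that is why the original call ended up with far more than 10 series. A rough sketch on toy data follows; `wide`, `m` and `city_ts` are made-up names, the values come from the sample rows in the question, and `plot.type = "single"` draws all selected series in one panel instead of one panel per series.

```r
# Toy data: one row per city, one column per month (values from the question).
wide <- data.frame(
  RegionName = c("New York", "Los Angeles", "Philadelphia", "Phoenix"),
  `1998-01`  = c(1.3414, 12.8841, 1.626, 2.7046),
  `1998-02`  = c(1.344, 12.5466, 0.5639, 2.5525),
  `1998-03`  = c(1.3514, 12.2737, 0.2414, 2.3472),
  check.names = FALSE
)

# Transpose the numeric part so months are rows and cities are columns.
month_cols <- setdiff(names(wide), "RegionName")
m <- t(as.matrix(wide[, month_cols]))
colnames(m) <- wide$RegionName

# Monthly series starting in January 1998; the end is implied by the number of rows.
city_ts <- ts(m, start = c(1998, 1), frequency = 12)

# Plot only the cities of interest, all in one panel.
cities <- c("New York", "Phoenix")
plot(city_ts[, cities], plot.type = "single", col = seq_along(cities), ylab = "value")
legend("topright", legend = cities, col = seq_along(cities), lty = 1)
```

Selecting columns by city name keeps the plot readable; with thousands of cities you would still want to pick a handful at a time.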
andybega
  • I edited my post to include the real column names and a few rows. R reads all column names as starting with a character. Thank you for your help. – user2892196 Nov 01 '13 at 00:30