
I want to draw a line graph of around 145 observations using R. The data is in the format below:

Date  Total Confirmed  Total Deceased
3-Mar    6               0
4-Mar    28              0
5-Mar    30              5
.
.
.
141 more obs like this

I'm new to ggplot2 in R, so I don't know how to get the graph. I tried plotting it, but the dates on the x-axis overlapped and were not readable. I want a line graph of the Total Confirmed column and the Total Deceased column together, with dates on the x-axis. Please also tell me how to colour the lines; I want a colourful graph. Thank you so much.

Similar questions give me a lot of errors when I try to adapt them, so I would like an answer for my specific requirements.

Henrik
RISHI
  • Does this answer your question? [Plotting two variables as lines using ggplot2 on the same graph](https://stackoverflow.com/questions/3777174/plotting-two-variables-as-lines-using-ggplot2-on-the-same-graph) – emilliman5 Jul 23 '20 at 12:29
  • @emilliman5 No, I'm really a beginner and I can't adapt it; it gives a lot of errors. It would be helpful to have my own question answered. – RISHI Jul 23 '20 at 12:55
  • Hi RISHI, welcome to Stack Overflow. This is a question and answer site which focuses on answering **specific** programming questions. It is not a tutorial site. Here are two tutorials I found on Google about [line graphs in base R](https://www.statmethods.net/graphs/line.html) or [line graphs in ggplot2](http://www.sthda.com/english/wiki/ggplot2-line-plot-quick-start-guide-r-software-and-data-visualization). emilliman's comment was automatically added by the system when they voted to close your question as a duplicate. – Ian Campbell Jul 23 '20 at 13:03

1 Answer


There are a lot of resources to help you create what you are looking to do - and even quite a few questions already answered here. However, I understand it's tough starting out, so here's a quick example to get you started.

Sample Data:

df <- data.frame(
  dates=c('2020-01-01','2020-02-01','2020-03-03','2020-03-14','2020-04-01'),
  var1=c(13,15,18,29,40),
  var2=c(5,8,11,13,18)
)

If you are plotting by date on your x axis, you need to ensure that df$dates is formatted as a "Date" class (or one of the other date-like classes). You can do that via:

df$dates <- as.Date(df$dates, format='%Y-%m-%d')

The format= argument of as.Date() should follow the conventions documented for strptime(). Just type ?strptime in your console and the help page for that function shows how the various conversion codes are defined for format=.
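For dates written like the ones in the question (3-Mar, 4-Mar, ...), a minimal sketch of the same idea could look like the following. The strings carry no year, so the year 2020 here is purely an assumption, and raw_dates, assumed_year and parsed are illustrative names. Note also that %b matches abbreviated month names in the current locale, so this assumes an English locale.

```r
# Day-month strings in the style of the question's data (no year included).
raw_dates <- c("3-Mar", "4-Mar", "5-Mar")

# ASSUMPTION: all observations fall in 2020; change this to match your data.
assumed_year <- 2020

# Append the assumed year, then parse:
# %d = day of month, %b = abbreviated month name, %Y = four-digit year.
parsed <- as.Date(paste(raw_dates, assumed_year, sep = "-"), format = "%d-%b-%Y")

# Confirm the result is of class "Date" before plotting.
class(parsed)
```

If a string does not match the format, as.Date() returns NA for it rather than raising an error, so it is worth checking anyNA(parsed) on the real data.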

The next step is very important: recognize that the data is in "wide" format, not "long" format. You will generally want your data in what is known as Tidy Data format, which is convenient for most analyses and is what ggplot2 and the related packages expect. In your data, the measure itself is a number of people, and the type of measure is either cases or deaths. So "number of people" is spread over two columns, and the information on "type of measure" is stuck in the column names when it should be a variable in the dataset. Your goal should be to gather() those two columns together and create two new columns: (1) one to indicate whether the number is "cases" or "deaths", and (2) the number itself. In the example I've shown, you can do this via:

library(dplyr)
library(tidyr)
library(ggplot2)

df <- df %>% gather(key='var_name', value='number', -dates)

The result is that the data frame has columns for:

  • dates: unchanged
  • var_name: contains either var1 or var2, stored as character strings
  • number: the actual number
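In current versions of tidyr, gather() is superseded by pivot_longer(), which produces the same three columns. A minimal sketch of the equivalent reshaping, using a fresh copy of the sample data under the hypothetical name wide_df (so it does not depend on df having been reshaped already):

```r
library(tidyr)

# Fresh copy of the wide-format sample data (wide_df is an illustrative name).
wide_df <- data.frame(
  dates = as.Date(c("2020-01-01", "2020-02-01", "2020-03-03", "2020-03-14", "2020-04-01")),
  var1 = c(13, 15, 18, 29, 40),
  var2 = c(5, 8, 11, 13, 18)
)

# Reshape every column except dates into name/value pairs.
long_df <- pivot_longer(
  wide_df,
  cols = -dates,
  names_to = "var_name",
  values_to = "number"
)

head(long_df)
```

The rows come out in a different order than with gather() (grouped by date rather than by variable), which makes no difference to the plot.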

Finally, for the plot, the code is quite simple. You apply dates to the x aesthetic, number to y, and use var_name to differentiate color for the line geom:

ggplot(df, aes(x=dates, y=number)) +
  geom_line(aes(color=var_name))

[Image: the resulting line plot, one colored line per var_name]
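The question also asks about overlapping date labels and custom colours. One way to handle both, sketched here under some assumptions, is to control the x-axis breaks with scale_x_date() and to set the line colours with scale_color_manual(). The data below is made-up placeholder data in long format (145 daily rows per series, not real figures), and the two-week break interval, colour names and label angle are arbitrary choices to adjust to taste.

```r
library(ggplot2)

# PLACEHOLDER data: 145 consecutive days with invented numbers, already in
# long format. Replace this with your own reshaped data frame.
days <- seq(as.Date("2020-03-03"), by = "day", length.out = 145)
plot_df <- data.frame(
  dates = rep(days, times = 2),
  var_name = rep(c("Total Confirmed", "Total Deceased"), each = length(days)),
  number = c(seq_along(days) * 10, seq_along(days))
)

p <- ggplot(plot_df, aes(x = dates, y = number, color = var_name)) +
  geom_line() +
  # Label every second week instead of every date, so labels do not overlap.
  scale_x_date(date_breaks = "2 weeks", date_labels = "%d-%b") +
  # Pick the colour for each series explicitly.
  scale_color_manual(values = c("Total Confirmed" = "steelblue",
                                "Total Deceased" = "firebrick")) +
  labs(x = "Date", y = "Number of people", color = NULL) +
  # Rotate the date labels to give them more room.
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

print(p)
```

The names in the values vector must match the entries of var_name exactly, otherwise ggplot2 will not find a colour for that series.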

Ian Campbell
chemdork123