
I have CSV data with two columns, and I need to plot Time on the x axis and count on the y axis. The time values range from September 2008 to December 2021, with one data point per month.

This is my data in csv format.

When I plot the data, there are so many month-year values that the labels on the x axis are unreadable.


I want to show only 5 specific time points on the x axis.

This is what I tried:

library(ggplot2)
theme_set(
  theme_bw() +
    theme(legend.position = "top")
)

result <- read.csv("Downloads/Questions Trend - Questions Trend.csv")
p <- ggplot(result, aes(result$Time_Formatted, result$VCS_Feature_History_Sanitize_Trnd)) + 
     geom_point(size = 0.1) + xlab("Month") + ylab("Temporal Trend")
p + geom_smooth(method = "loess", color = "red")

I then tried the code below, which reduced the number of labels, but I still cannot set the breaks to specific points.

library(ggplot2)
library(scales)
theme_set(
  theme_bw() +
    theme(legend.position = "top")
)

result <- read.csv("Downloads/Questions Trend - Questions Trend.csv")
result$Time_Formatted <- as.Date(result$Time_Formatted)
p <- ggplot(result, aes(result$Time_Formatted, result$VCS_Feature_History_Sanitize_Trnd)) + 
     geom_point(size = 0.1) + xlab("Month") + ylab("Temporal Trend") +
    scale_x_date(date_breaks = "years" , date_labels = "%b-%y")
p + geom_smooth(method = "loess", color = "red")


How can I place ticks at specific points on the x axis?

Setu Kumar Basak
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. Please do not post data in an image. You probably just need to convert your `Time_Formatted` to a proper date/time column. It's likely just a character column now though hard to tell from the picture. – MrFlick Mar 07 '22 at 07:07
  • Try scale_x_date(date_breaks = "1 day", date_labels = '%Y-%m-%d') + or scale_x_date(date_breaks = "1 week", date_minor_breaks = '1 day', date_labels = '%Y\n%m-%d') – Rfanatic Mar 07 '22 at 07:10
  • From the cluttered x axis and the lack of smoother it looks like `result$Time_Formatted` is a string rather than date. Converting it to date before running the plot command will give you your smoother and more useful default x labels. This would be easier to spot if you'd provided your data using `dput` rather than as a screenshot (as requested in the tag wiki), which hides some information about the column formats – Miff Mar 07 '22 at 08:23
  • @Miff, I have provided the link of my data in the question. – Setu Kumar Basak Mar 07 '22 at 17:59
  • @MrFlick, I have provided the link of my data in the query. – Setu Kumar Basak Mar 07 '22 at 18:01
  • Data should not be stored in off-site links. Those links break over time or may contain unsafe content. Plus you seem to have just linked to a raw CSV, which doesn't tell us how you imported the data into R, so we can't recreate the exact data structure you have. You should share a subset of your data via `dput()` as described in the link provided. – MrFlick Mar 07 '22 at 18:03

2 Answers


You should use the scale_x_date() function of the ggplot2 package.

For example, here is code I regularly use in my work when I need to plot time data:

ggplot2::scale_x_date(name   = " ",
                        breaks = function(date) seq.Date(from = lubridate::ymd("2020-01-01")+1, 
                                                         to = lubridate::today(), 
                                                         by = "1 month"),
                        limits = c(lubridate::ymd("2020-07-13"),
                                   lubridate::today()),
                        expand  = c(0,0),
                        labels = scales::date_format("%b %Y"))

With breaks, you control which dates get a tick; here there is one tick per month.

With labels and the function date_format() from the scales package, you can choose how each date is displayed. Here, I show the abbreviated month name followed by the four-digit year.
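
As a rough sketch of how this approach might be adapted to the date range in the question (September 2008 to December 2021), the snippet below builds made-up monthly data and places a tick every three years. The data frame `df`, its columns `month` and `count`, and the three-year spacing are illustrative assumptions, not the asker's actual data; `scales::label_date()` is used here in place of `date_format()`.

```r
library(ggplot2)

# Made-up monthly data covering the range mentioned in the question.
set.seed(1)
df <- data.frame(
  month = seq(as.Date("2008-09-01"), as.Date("2021-12-01"), by = "month")
)
df$count <- cumsum(rnorm(nrow(df)))

# One tick every three years, starting from January 2009 (assumed spacing).
year_breaks <- seq(as.Date("2009-01-01"), as.Date("2021-01-01"), by = "3 years")

p <- ggplot(df, aes(month, count)) +
  geom_point(size = 0.1) +
  scale_x_date(
    name   = NULL,
    breaks = year_breaks,
    limits = range(df$month),
    labels = scales::label_date("%b %Y")
  )
p
```

Passing a plain vector of dates to `breaks` keeps the tick positions explicit, which may be easier to reason about than a function when the set of ticks is small and fixed.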

1

Most of your problems here are related to reading in the date data so that the format is correctly recognised - this can be done by specifying it explicitly:

result <- read.csv("Test.csv")
result$Time_Formatted <- as.Date(result$Time_Formatted, "%m/%d/%y")
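
To check that the conversion worked before plotting, something like the following sketch may help. The strings in `raw` are hypothetical sample values in month/day/two-digit-year form, assumed to resemble the CSV column; they are not taken from the actual file.

```r
# Hypothetical sample values in month/day/two-digit-year form.
raw <- c("9/1/08", "10/1/08", "12/1/21")

# Parse with an explicit format so the fields are not misread.
parsed <- as.Date(raw, format = "%m/%d/%y")

class(parsed)   # "Date"
range(parsed)   # earliest and latest dates, useful as a sanity check
```

If `range()` returns dates far outside the expected 2008 to 2021 window, the format string probably does not match the data.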

Then it's simply a case of making a vector indicating where you want breaks and specifying this in the scale_x_date with:

date_breaks <- as.Date(c("1/7/10", "1/12/12", "1/1/14", "1/2/15", "1/3/16"), "%d/%m/%y")

p <- ggplot(result, aes(Time_Formatted, VCS_Feature_History_Sanitize_Trnd)) + 
      geom_point(size = 0.1) + xlab("Month") + ylab("Temporal Trend") +
      scale_x_date(breaks=date_breaks , date_labels = "%b-%y")
p + geom_smooth(method = "loess", color = "red")

Note that I've removed the explicit reference to "result" in the aes() function, as it's unnecessary and deprecated according to the warning it creates. The end result is:

Output plot
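
If the five ticks do not need to fall on hand-picked dates, one possible alternative is to generate them from the data range. This is only a sketch: the `months` vector is made up to span the range described in the question, and `even_breaks` is a hypothetical name.

```r
# Made-up monthly dates spanning the range described in the question.
months <- seq(as.Date("2008-09-01"), as.Date("2021-12-01"), by = "month")

# Five evenly spaced dates from the first to the last observation.
even_breaks <- seq(min(months), max(months), length.out = 5)

# These could then be passed as scale_x_date(breaks = even_breaks, ...).
even_breaks
```

Evenly spaced breaks will generally not land on the first of a month, so a hand-written vector like the one above remains the better choice when exact dates matter.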

Miff
  • If you're using dates a lot in your work it's worth getting familiar with the `lubridate` package as used in Kévin Legueult's answer. – Miff Mar 07 '22 at 19:39