I have monthly data on sales volumes:
YearMonth Sales Count
2010-04 300
2010-05 342
2010-06 425
and I just want to draw a line graph in R to observe the trend.
I use ggplot2 in R:
ggplot(data,
aes(x = YearMonth, y = `Sales Count`)) +
geom_line()
However, R gives me an error message:
geom_path: Each group consists of only one observation.
Do you need to adjust the group aesthetic?
I tried many ways to convert the variable "YearMonth" to a numeric variable, but none of them worked.
Because the data was generated in Python, I checked the data types using:
data.dtypes
and it returns
YearMonth object
Sales Count int64
dtype: object
I tried to convert it using
data['YearMonth'] = pd.to_datetime(data['YearMonth'])
but it converts everything to the first day of the month, i.e. the data now looks like:
YearMonth Sales Count
2010-04-01 300
2010-05-01 342
2010-06-01 425
Since the x-axis should show each month rather than the first day of each month, is there any way to keep just the month and still plot it as a numeric or datetime variable?
Many thanks!!
EDITS
Actually, when I plot it in R, the x-axis only shows years like 2010, 2011, so the issue above does not matter if we can change what is shown on the x-axis. Is there a way to control the x-axis labels, e.g. to show 2010 April, 2010 May, rather than just the year?
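To illustrate why the first-of-month dates are harmless: the day is only an anchor for the date type, and it can be dropped again when the label is formatted. A minimal sketch using only Python's standard library, where `raw_months` and `month_label` are illustrative names and not part of my actual data pipeline:

```python
from datetime import datetime

# Hypothetical sample mirroring the YearMonth column shown above.
raw_months = ["2010-04", "2010-05", "2010-06"]

def month_label(year_month):
    """Parse a 'YYYY-MM' string and return a label without the day."""
    # strptime defaults the day to 1 when the format has no day field,
    # which matches what pd.to_datetime did to the column.
    parsed = datetime.strptime(year_month, "%Y-%m")
    # %Y is the four-digit year, %b the abbreviated month name
    # (the month name follows the active locale).
    return parsed.strftime("%Y %b")

labels = [month_label(m) for m in raw_months]
print(labels)
```

The same `"%Y %b"` pattern is what a date axis needs in order to show the month next to the year.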
SOLUTION
Combining the answers from @Jon Spring and @ThomasPepperz, the following code gives me exactly what I want:
data[['YearMonth']] = lubridate::ymd(paste(data[['YearMonth']], 1))
ggplot(data, aes(YearMonth, `Sales Count`)) +
geom_line() +
scale_x_date(date_breaks = "6 months",
date_labels = "%Y %b") +
theme(axis.text.x = element_text(angle=90, hjust=1))
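To preview the kind of labels that a six-month spacing combined with the `"%Y %b"` format produces, here is a rough Python sketch. It is not how ggplot2 chooses its breaks, only an illustration of the label pattern, and `six_month_ticks` is a hypothetical helper, assuming the series starts in April 2010 as in my data:

```python
from datetime import date

def six_month_ticks(start, count):
    """Return `count` first-of-month dates spaced six months apart."""
    ticks = []
    year, month = start.year, start.month
    for _ in range(count):
        ticks.append(date(year, month, 1))
        month += 6
        if month > 12:
            # Roll over into the next year.
            month -= 12
            year += 1
    return ticks

ticks = six_month_ticks(date(2010, 4, 1), 4)
# Same format string as date_labels in the ggplot2 call above.
tick_labels = [t.strftime("%Y %b") for t in ticks]
print(tick_labels)
```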