I have a data frame with dates in the column "månad", formatted like 2020M07. When I plot it, the x-axis gets crowded. Instead of one continuous line, I want one series per year, with only the month on the x-axis.
To do this I need a column containing only the year, so I can group on it in ggplot, and a column containing only the month, to use as the x variable. How can I derive both from the "månad" column? Is there something like Excel's LEFT function that works inside dplyr's mutate? If not, how should I do this?
Maybe my approach isn't the best. Feel free to answer with my suggested solution, a better one, or both!
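As a rough sketch of what I mean (not run against the real data): base R's substr() could play the role of LEFT inside mutate. The toy data frame and the new column names year and month are made up for illustration.

```r
library(dplyr)

# Toy data standing in for the real df (values are invented)
df_toy <- data.frame(
  månad = c("2020M07", "2020M08", "2021M01"),
  syss_natt = c(100, 110, 120)
)

df_split <- df_toy %>%
  mutate(
    year  = substr(månad, 1, 4),              # like LEFT(månad, 4) in Excel
    month = as.integer(substr(månad, 6, 7))   # the two digits after the "M"
  )
```

This assumes every value follows the fixed-width pattern YYYYMmm; anything else would need a different split.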
Thanks!
EDIT: here is the code that produces the df in its current state.
Code:
library(pxweb)
library(tidyverse)
library(astsa)
library(forecast)
library(scales)
library(plotly)
library(zoo)
library(lubridate)
# PXWEB query
pxweb_query_list_BAS <-
  list("Region"       = c("22"),
       "Kon"          = c("1+2"),
       "SNI2007"      = c("A-U+US"),
       "ContentsCode" = c("000005F3"))
# Download data
px_data_BAS <-
  pxweb_get(url = "https://api.scb.se/OV0104/v1/doris/sv/ssd/START/AM/AM0210/AM0210B/ArbStDoNMN",
            query = pxweb_query_list_BAS)
# Convert to data.frame
df_syss_natt <- as.data.frame(px_data_BAS, column.name.type = "text", variable.value.type = "text") %>%
  rename(syss_natt = 'sysselsatta efter bostadens belägenhet') %>%
  # månad is text such as "2020M07", so compare against a string explicitly
  filter(månad > "2020")
# Plot data
ggplot(df_syss_natt, aes(x = månad, y = syss_natt, group = 1)) +
  geom_point() +
  geom_line(color = "red")
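
For comparison, the per-year plot I am after would look roughly like this sketch. It assumes the year and month columns already exist; the toy data and the column names year and month are made up for illustration.

```r
library(ggplot2)

# Toy data with the year and month already split out (values are invented)
df_plot <- data.frame(
  year = c("2020", "2020", "2021", "2021"),
  month = c(11L, 12L, 1L, 2L),
  syss_natt = c(100, 105, 98, 102)
)

# One line per year; the x-axis shows only the month
p <- ggplot(df_plot, aes(x = month, y = syss_natt, colour = year, group = year)) +
  geom_point() +
  geom_line() +
  scale_x_continuous(breaks = 1:12, labels = month.abb)

print(p)
```

Keeping year as text makes ggplot treat it as a discrete variable, so each year gets its own colour and line.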