If I have the following dataframe called data
year month id group returns
2016 2 asset_a group1 0.11592118
2016 3 asset_a group1 0.104526128
2016 4 asset_a group1 0.244925532
2016 5 asset_a group1 0.252377372
2016 6 asset_a group1 0.282602889
2016 7 asset_a group1 0.607148925
2016 8 asset_a group1 0.257815581
2016 9 asset_a group1 0.202712468
2016 10 asset_a group1 0.177455704
2016 11 asset_a group1 0.208526305
2016 12 asset_a group1 0.179808043
2017 1 asset_a group1 0.204425208
2017 2 asset_a group1 0.167787787
2017 3 asset_a group1 0.122357671
2017 4 asset_a group1 0.095889965
2017 5 asset_a group1 0.180117687
2017 6 asset_a group1 0.146912234
2017 7 asset_a group1 0.286743829
2017 8 asset_a group1 0.201531197
2017 9 asset_a group1 0.166819132
2017 10 asset_a group1 0.136262625
2017 11 asset_a group1 0.128844762
2017 12 asset_a group1 0.147595906
2018 1 asset_a group1 0.099843877
2018 2 asset_a group1 0.1928918
2018 3 asset_a group1 0.188344307
2018 4 asset_a group1 0.155801889
2018 5 asset_a group1 0.185813076
2018 6 asset_a group1 0.217531263
2018 7 asset_a group1 0.269840901
2018 8 asset_a group1 0.267351364
2018 9 asset_a group1 0.183753448
2018 10 asset_a group1 0.195182592
2018 11 asset_a group1 0.228886115
2018 12 asset_a group1 0.166964407
and, in order to plot it as a heatmap, I create a date column with
data <- data %>%
  mutate(date = make_datetime(year, month))
I get a data frame with the structure
$ year : int [1:564] 2016 2016 2016 2016 2016 2016 2016 2016 2016 2016 ...
$ month : int [1:564] 2 2 2 2 2 2 2 2 3 3 ...
$ id : chr [1:564] "asset_a" "asset_b" "asset_c" "asset_d" ...
$ group : chr [1:564] "group1" "group2" "group3" "group4" ...
$ returns : num [1:564] 0.115 0.3 0.105 0.245 0.28 ...
$ date : POSIXct[1:564], format: "2016-02-01" "2016-02-01" "2016-02-01" "2016-02-01" ...
and passing that to the ggplot heatmap code
# rescale() comes from the scales package
data %>%
  ggplot(aes(x = date, y = id)) +
  geom_tile(aes(fill = returns)) +
  theme_classic() +
  scale_fill_gradientn(colours = c("#66bf7b", "#a1d07e", "#dce182",
                                   "#ffeb84",
                                   "#fedb81", "#faa075", "#faa075"),
                       values = rescale(c(-3, -2, -1,
                                          0,
                                          1, 2, 3)),
                       guide = "colorbar") +
  labs(x = "", y = "")
I get a heatmap with white gaps between the dates.
Why does ggplot appear to create missing data out of nowhere, given that the data in my data frame has no monthly discontinuities? How can I fix it so that there are no white gaps between the dates? Is it related to the hours and seconds in the date-time format?
If I plot the dates as characters, I get the desired result. In that case, however, how can I reduce the number of ticks on the date axis so that it stays readable?
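For the character-axis route, one possible way to thin out the ticks is to pass only a subset of the labels to scale_x_discrete(breaks = ...). The following is a minimal sketch under assumptions: the toy data frame, the ym column, and the choice of every third month are made up for illustration and are not part of my real data.

```r
library(dplyr)
library(ggplot2)

# Toy data standing in for the real data frame (values are made up)
toy <- expand.grid(month = 1:12, id = c("asset_a", "asset_b"),
                   stringsAsFactors = FALSE) %>%
  mutate(year = 2016L,
         returns = seq(0.1, 0.3, length.out = n()),
         # hypothetical character label such as "2016-01"; sorts chronologically
         ym = sprintf("%d-%02d", year, month))

# Keep every third label as an axis break (the step of 3 is an arbitrary choice)
all_labels <- sort(unique(toy$ym))
shown <- all_labels[seq(1, length(all_labels), by = 3)]

p <- ggplot(toy, aes(x = ym, y = id)) +
  geom_tile(aes(fill = returns)) +
  scale_x_discrete(breaks = shown) +
  theme_classic() +
  labs(x = "", y = "")
```

All tiles are still drawn, since the axis stays discrete; only the labels listed in shown get a tick.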
UPDATE: stefan's suggestion didn't solve it, because each asset id should have its own heatmap row. Right now, the rows are plotted on top of each other.
UPDATE 2
For me, this didn't work:
breaks <- sort(unique(as.numeric(factor(data$id)))) - .5
labels <- levels(factor(data$id))
Typing the row limits out manually:
mutate(xmin = date,
       xmax = date + months(1),
       # lower edge of each asset's row
       ymin = case_when(
         id == "asset_a" ~ 0,
         id == "asset_b" ~ 1,
         id == "asset_c" ~ 2,
         id == "asset_d" ~ 3,
         id == "asset_e" ~ 4,
         id == "asset_f" ~ 5,
         id == "asset_g" ~ 6,
         id == "asset_h" ~ 7,
         id == "asset_i" ~ 8
       ),
       # upper edge of each asset's row
       ymax = case_when(
         id == "asset_a" ~ 1,
         id == "asset_b" ~ 2,
         id == "asset_c" ~ 3,
         id == "asset_d" ~ 4,
         id == "asset_e" ~ 5,
         id == "asset_f" ~ 6,
         id == "asset_g" ~ 7,
         id == "asset_h" ~ 8,
         id == "asset_i" ~ 9)
)
solved the problem: each asset id now gets its own row, with the rows stacked one above the other.
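The same row limits could probably be derived without typing each case, assuming the rows should follow the alphabetical order of id. This is only a sketch: the toy data, the helper column row, and the use of geom_rect with scale_y_continuous are my assumptions for illustration, not the code from the suggested answer.

```r
library(dplyr)
library(lubridate)
library(ggplot2)

# Toy data standing in for the real data frame (values are made up)
toy <- expand.grid(month = 1:6, id = c("asset_a", "asset_b", "asset_c"),
                   stringsAsFactors = FALSE) %>%
  mutate(year = 2016L,
         returns = seq(0.1, 0.3, length.out = n()),
         date = make_datetime(year, month))

rects <- toy %>%
  mutate(row  = as.integer(factor(id)),  # asset_a -> 1, asset_b -> 2, ...
         xmin = date,
         xmax = date + months(1),        # each rectangle spans one calendar month
         ymin = row - 1,                 # same limits as the manual case_when
         ymax = row)

p <- ggplot(rects) +
  geom_rect(aes(xmin = xmin, xmax = xmax,
                ymin = ymin, ymax = ymax, fill = returns)) +
  # put one label in the middle of each row
  scale_y_continuous(breaks = sort(unique(rects$row)) - 0.5,
                     labels = levels(factor(rects$id))) +
  theme_classic() +
  labs(x = "", y = "")
```

Because xmax is the first day of the following month, neighbouring rectangles share an edge regardless of how many days each month has.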