I have a data frame with two variables. The first variable contains dates. The second variable is logical and specifies whether some statement is true or false for that day (e.g. "On that day it rained.").

I'd like to plot the data as a curve that dips from 1 (for TRUE) towards 0 (for FALSE), with the depth of the dip depending on the density of FALSE values in a time span. That is, the more FALSE values lie close to each other, the deeper the dip. Something like this:

[Image: sketch giving an example of the outcome I want to achieve]


Sample data:

dat <- read.table(textConnection("
Var1 Var2 
2019-01-01 TRUE
2019-01-02 TRUE
2019-01-03 TRUE
2019-01-04 TRUE
2019-01-05 TRUE
2019-01-06 TRUE
2019-01-07 TRUE
2019-01-08 TRUE
2019-01-09 FALSE
2019-01-10 TRUE
2019-01-11 TRUE
2019-01-12 TRUE
2019-01-13 TRUE
2019-01-14 TRUE
2019-01-15 TRUE
2019-01-16 TRUE
2019-01-17 FALSE
2019-01-18 TRUE
2019-01-19 FALSE
2019-01-20 TRUE
2019-01-21 FALSE
2019-01-22 TRUE
2019-01-23 TRUE
2019-01-24 TRUE
2019-01-25 TRUE
2019-01-26 TRUE
2019-01-27 TRUE
2019-01-28 TRUE
2019-01-29 TRUE
2019-01-30 TRUE
2019-01-31 TRUE
2019-02-01 TRUE
2019-02-02 FALSE
2019-02-03 TRUE
2019-02-04 FALSE
2019-02-05 FALSE
2019-02-06 FALSE
2019-02-07 TRUE
2019-02-08 FALSE
2019-02-09 FALSE
2019-02-10 TRUE
2019-02-11 TRUE
2019-02-12 TRUE
2019-02-13 TRUE
2019-02-14 TRUE
2019-02-15 TRUE
2019-02-16 FALSE
2019-02-17 FALSE
2019-02-18 FALSE
2019-02-19 FALSE
2019-02-20 TRUE
2019-02-21 FALSE
2019-02-22 TRUE
2019-02-23 FALSE
2019-02-24 FALSE
2019-02-25 TRUE
2019-02-26 TRUE
2019-02-27 FALSE
2019-02-28 TRUE
2019-03-01 TRUE
"), header = TRUE, colClasses=c("Date", "logical"))
plot(dat)
  • You could plot the `rollmean` of `Var2` over the number of points you want (i.e. your time span) using the `zoo` library. – denis Feb 11 '19 at 08:08
  • @denis Nice idea. Would you like to post an answer? –  Feb 11 '19 at 10:07
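
A minimal sketch of the `rollmean` suggestion from the comment above, assuming the `zoo` package is installed. The vector `flag` is a hypothetical stand-in for `dat$Var2`, and the window width `k = 3` is an arbitrary choice, not something the comment specifies:

```r
library(zoo)  # assumed to be installed

# Hypothetical toy vector standing in for dat$Var2
flag <- c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE, TRUE, TRUE, TRUE)

# Centered 3-day rolling mean; fill = NA pads both ends so the
# result has the same length as the input
smoothed <- rollmean(as.numeric(flag), k = 3, fill = NA, align = "center")
smoothed
```

A larger `k` averages over a longer time span, so isolated FALSE values produce shallower dips and only dense clusters pull the curve close to 0.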

2 Answers

Not sure exactly what you are looking for, but here is an idea:

library(ggplot2)

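# aggregate() averages Var2 per date, which also turns the logicals into 0/1 numbers
ggplot(data = aggregate(Var2 ~ Var1, dat, FUN = mean), aes(x = Var1, y = Var2)) +
  # loess smoother through the 0/1 values, without the confidence band
  geom_smooth(se = FALSE, method = "loess", formula = y ~ x) +
  # raw observations drawn as open circles at 0 and 1
  geom_point(data = dat, aes(x = Var1, y = as.integer(Var2)), shape = 1)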
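
How closely the loess curve follows clusters of FALSE values depends on its `span`, which `geom_smooth()` also accepts as an argument. The following is only a base-R sketch of that effect: the toy data and the value `span = 0.3` are assumptions chosen for illustration, not part of the answer above.

```r
# Hypothetical toy data: 60 days, mostly TRUE with a few clusters of FALSE
days <- seq(as.Date("2019-01-01"), by = "day", length.out = 60)
flag <- rep(TRUE, 60)
flag[c(9, 33, 35:37, 39:40, 47:50)] <- FALSE

x <- as.numeric(days)
y <- as.numeric(flag)

wide   <- loess(y ~ x, span = 0.75)  # loess default span
narrow <- loess(y ~ x, span = 0.3)   # smaller span, more local fit

plot(days, y, ylim = c(-0.2, 1.2))
lines(days, fitted(wide), lty = 2)   # dashed: default span
lines(days, fitted(narrow))          # solid: smaller span
```

Note that a loess fit is not constrained to the interval from 0 to 1, so the curve can overshoot slightly near the edges of a cluster.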


s_baldur

You can convert the logicals to numeric, calculate a moving average (ma function copied from here), and plot the result with ggplot2:

library(dplyr)
library(ggplot2)

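# Centered moving average over n points; stats:: is needed because dplyr masks filter().
# as.numeric() drops the "ts" class that stats::filter() returns.
ma <- function(x, n = 5) {
  as.numeric(stats::filter(x, rep(1 / n, n), sides = 2))
}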

dat %>%
  mutate(
    Var3 = as.numeric(Var2),
    ma = ma(Var3, n = 6)
  ) %>%
  ggplot(aes(x = Var1, y = ma)) +
  geom_line() +
  geom_point(aes(y = Var3)) +
  ylim(0, 1)

There are probably better moving-average formulas for longitudinal data than this simple one.
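
As one possible refinement, offered as a sketch rather than a recommendation: an odd window length keeps the window symmetric around each day (with an even length and `sides = 2`, `stats::filter()` puts more of the window forward in time), and triangular weights let nearby days count more than distant ones. The helper names `ma_flat` and `ma_tri` and the toy vector `x` are hypothetical; `x` stands in for `as.numeric(dat$Var2)`.

```r
# Hypothetical toy vector standing in for as.numeric(dat$Var2)
x <- c(1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1)

# Plain centered moving average; n is assumed to be odd
ma_flat <- function(x, n = 5) {
  as.numeric(stats::filter(x, rep(1 / n, n), sides = 2))
}

# Triangular weights, e.g. 1 2 3 2 1 for n = 5, normalised to sum to 1;
# n is assumed to be odd
ma_tri <- function(x, n = 5) {
  half <- (n + 1) / 2
  w <- c(seq_len(half), rev(seq_len(half - 1)))
  as.numeric(stats::filter(x, w / sum(w), sides = 2))
}

ma_flat(x)
ma_tri(x)
```

Both helpers return NA for the first and last `(n - 1) / 2` positions, where the window would extend past the data.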


Paul