Merge Df together R

Question

I am trying to create a data frame that contains a) a time span of 51 days (the period of the Corona lockdown) and b) the calculated frequencies of tweets in this time span. The problem is that tweets were not posted on every day, so some dates are missing from the frequency table. To continue and calculate some correlations, I need a data frame that has a value (or a missing value) for every single day of the time span. How can I achieve this? Is there another way to calculate the frequencies, or a way to bind the data together?

## This is the date vector that contains all the dates I need values for
LockdownDays <- seq.Date(from = as.Date('2020-03-19'), to = as.Date('2020-05-08'), by = 'days')

## These are the frequencies calculated from the Tweets dataset
frequencyD <- table(ThemaD$date)

As I said, the problems are:

1. The two objects are of different lengths.
2. Each frequency value has to match the right date in the LockdownDays vector.

So in the end I want a data frame with the dates on the x axis and the frequencies on the y axis. If there is no frequency for a day, I still want that date in the data frame, ideally with 0 or NA as the y value.

df <- c("2020-01-02", "2020-01-03", "2020-01-03", "2020-01-05")
freq <- table (df)
dates <-  seq.Date(from = as.Date('2020-01-01'), to = as.Date('2020-01-08'), by = 'days') 
print (dates)
cbind (freq, df)
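
For the toy data above, the desired shape (one row per day, 0 where nothing was tweeted) can be sketched in base R by counting a factor whose levels are all the wanted dates. This is only an illustration of the target output, not the approach from the accepted answer; the name `result` is made up for the example.

```r
# Toy data from the example above
df <- c("2020-01-02", "2020-01-03", "2020-01-03", "2020-01-05")
dates <- seq.Date(from = as.Date("2020-01-01"), to = as.Date("2020-01-08"), by = "days")

# Counting a factor whose levels are ALL wanted dates gives a count of 0
# for every day that does not occur in the data
freq <- table(factor(df, levels = as.character(dates)))

# One row per day: the date and its count
# (`result` is a hypothetical name used only in this sketch)
result <- data.frame(
  date = as.Date(names(freq)),
  freq = as.vector(freq)
)
print(result)
```

Because the factor levels come from `dates`, the table always has one entry per day of the span, so no separate merge step is needed.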

Welcome to Stack Overflow! Can you please give a minimal reproducible example in your question? – jogo, Sep 08 '20 at 13:14
If you need some help, [please check this post](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). For example, to produce a minimal data set, you can use `head()` or `subset()`, then use `dput()` to give us something that can be put in R immediately. Alternatively, you can use base R datasets such as `mtcars`, `iris`, *etc*. – Paul, Sep 08 '20 at 13:16
Also, have you checked the join functions from `dplyr`, such as `left_join()`, `right_join()`, `inner_join()`, `full_join()`? See ?`mutate-joins` for further info. – Paul, Sep 08 '20 at 13:22
I tried them, but they produced an error. I was also not quite sure which function to choose. – Rennacker54, Sep 08 '20 at 13:33
Try `dput(head(ThemaD$date,30))`. The picture you just showed is about a `Var1` and a `Freq` that are not even in the question. Also: pictures are a bad way to communicate rows of numeric data... – Bernhard, Sep 08 '20 at 13:34
```
> dput(head(ThemaD$date,30))
structure(c(18341, 18343, 18343, 18344, 18344, 18344, 18345, 18345, 18345, 18345, 18348, 18350, 18351, 18352, 18357, 18357, 18358, 18358, 18358, 18360, 18360, 18360, 18360, 18362, 18362, 18362, 18362, 18362, 18362, 18362), class = "Date")
```
I don't see how this solves the problem, sorry, I am a complete newbie. – Rennacker54, Sep 08 '20 at 14:04

alex_jwb90 · Accepted Answer · 2020-09-08T14:26:24.767

0

You can put your dates and frequency table into dataframes and then use dplyr::left_join to achieve what you want:

library(dplyr)

# OP's data
LockdownDays <-  seq.Date(from = as.Date('2020-03-19'), to = as.Date('2020-05-08'), by = 'days')
ThemaD <- tibble(
    date = structure(c(18341, 18343, 18343, 18344, 18344, 18344, 18345,  18345, 18345, 18345, 18348, 18350, 18351, 18352, 18357, 18357,  18358, 18358, 18358, 18360, 18360, 18360, 18360, 18362, 18362,  18362, 18362, 18362, 18362, 18362), class = "Date") 
)
frequencyD <- table(ThemaD$date)

left_join solution here:

df <- tibble(
    date = LockdownDays
  ) %>%
  left_join(
    as.data.frame(frequencyD) %>%
      mutate(Var1 = as.Date(Var1)),
    by = c(date = 'Var1')
  )
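
After the join, days without tweets carry `NA` in the `Freq` column. Since the question mentions that 0 would also be acceptable, here is a minimal sketch of one way to fill those gaps, using `dplyr::coalesce()`. It uses only a few of the OP's dates, and the names `tweet_dates`, `counts` and `df0` are made up for the illustration.

```r
library(dplyr)

LockdownDays <- seq.Date(from = as.Date("2020-03-19"), to = as.Date("2020-05-08"), by = "days")

# A few of the dates from the OP's dput() output (days since 1970-01-01);
# `tweet_dates` is a hypothetical name for this sketch
tweet_dates <- structure(c(18341, 18343, 18343, 18344), class = "Date")

# Naming the table argument gives the resulting column the name `date`
# instead of the default `Var1`
counts <- as.data.frame(table(date = tweet_dates), stringsAsFactors = FALSE) %>%
  mutate(date = as.Date(date))

df0 <- tibble(date = LockdownDays) %>%
  left_join(counts, by = "date") %>%
  # Days with no match get NA from the join; replace those with 0
  mutate(Freq = coalesce(Freq, 0L))
```

Whether 0 or `NA` is the better choice depends on the later analysis: 0 treats a day without tweets as a real observation, while `NA` marks it as missing.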

edited Sep 08 '20 at 14:26

answered Sep 08 '20 at 14:05


Error: Can't join on `x$date` x `y$date` because of incompatible types. i `x$date` is of type >>. i `y$date` is of type >. – Rennacker54 Sep 08 '20 at 14:10
Have you changed `LockdownDays` in any way (is it a dataframe/tibble now)? The way it is generated in your post, it will be of the same type. – alex_jwb90 Sep 08 '20 at 14:20
I have copied in the data you supplied in your example, and it works for me. – alex_jwb90 Sep 08 '20 at 14:27
Worked, thank you. The LockdownDays had been in a wrong format. – Rennacker54 Sep 08 '20 at 14:42
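
The thread does not say exactly what the wrong format was. As a hedged sketch, assuming `LockdownDays` had accidentally been turned into a data frame (or stored as text), a quick check of its class before joining could look like this:

```r
# Hypothetical reconstruction of the problem: LockdownDays stored as a
# data frame instead of a plain Date vector (an assumption, not confirmed
# by the thread)
LockdownDays <- data.frame(
  date = seq.Date(from = as.Date("2020-03-19"), to = as.Date("2020-05-08"), by = "days")
)

class(LockdownDays)  # a data frame here, not a Date vector

# Extract the Date column so the join key is a plain Date vector
if (is.data.frame(LockdownDays)) {
  LockdownDays <- LockdownDays$date
}

# If the dates had been stored as text instead, as.Date() would convert them
if (is.character(LockdownDays)) {
  LockdownDays <- as.Date(LockdownDays)
}

# Stop early if the join key still has the wrong type
stopifnot(inherits(LockdownDays, "Date"))
```

Running `class()` on both join keys before calling `left_join()` is usually the fastest way to find this kind of type mismatch.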
