3

Before asking, I have searched and read many other questions on stackoverflow and tried the joins from dplyr package but none gives the result I want.

I have two dfs and want to combine them that first values in df1$col1 which is Wed is matched with first occurrence of Wed in df2$col3 then these two dfs are combined as shown in df3

df1 <- data.frame(col1 = c('Wed', 'Thu', 'Fri', 'Sat', 'Sun', 'Mon',
                           'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun',
                           'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat',
                           'Sun', 'Mon', 'Tue'),
                  col2 = c(102, 101, 100, 107, 91, 80, 80, 68, 65, 69,
                           69, 74, 66, 58, 59, 64, 76, 70, 73, 71, 60))

df2 <- data.frame(col3 = c('Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun',
                           'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 
                           'Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 
                           'Sat', 'Sun', 'Mon'),
                  col4 = c(107, 91, 70, 73, 71, 80, 80, 58, 59, 64, 68,
                           65, 102, 101, 100, 69, 69, 74, 66, 76, 60))

This is what I want to get!

df3 <- data.frame(col1 = c('Wed', 'Thu', 'Fri', 'Sat', 'Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun', 'Mon', 'Tue'), 
col2 = c(102, 101, 100, 107, 91, 80, 80, 68, 65, 69, 69, 74, 66, 58, 59, 64, 76, 70, 73, 71, 60),
col3 = c('Wed', 'Thu', 'Fri', 'Sat', 'Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun', 'Mon', NA),
col4 = c(91, 70, 73, 71, 80, 80, 58, 59, 64, 68, 65, 102, 101, 100, 69, 69, 74, 66, 76, 60, NA))

col1 col2 col3 col4
Wed  102  Wed   91
Thu  101  Thu   70
Fri  100  Fri   73
Sat  107  Sat   71
Sun   91  Sun   80
Mon   80  Mon   80
Tue   80  Tue   58
Wed   68  Wed   59
Thu   65  Thu   64
Fri   69  Fri   68
Sat   69  Sat   65
Sun   74  Sun  102
Mon   66  Mon  101
Tue   58  Tue  100
Wed   59  Wed   69
Thu   64  Thu   69
Fri   76  Fri   74
Sat   70  Sat   66
Sun   73  Sun   76
Mon   71  Mon   60
Tue   60  NA    NA

The important part is to get the df2 to drop the rows in df2$col3 until it matches the weekday in df1$col1

Solution: I used cbindX function from gdata package df3 = cbindX(df1, df2)

Zmnako Awrahman
  • 538
  • 7
  • 19

2 Answers2

4

We can do this in two steps:

Find the first matched entry of df2 to df1 and cbind them filling the empty entries with NA. There are a few ways to bind unequal length data frames. You can find a few here. I chose to use qpcr:::cbind.na()

i1 <- which(df1$col1[1] == df2$col3)[1]
qpcR:::cbind.na(df1, df2[i1:nrow(df2),])

which gives,

   col1 col2 col3 col4
2   Wed  102  Wed   91
3   Thu  101  Thu   70
4   Fri  100  Fri   73
5   Sat  107  Sat   71
6   Sun   91  Sun   80
7   Mon   80  Mon   80
8   Tue   80  Tue   58
9   Wed   68  Wed   59
10  Thu   65  Thu   64
11  Fri   69  Fri   68
12  Sat   69  Sat   65
13  Sun   74  Sun  102
14  Mon   66  Mon  101
15  Tue   58  Tue  100
16  Wed   59  Wed   69
17  Thu   64  Thu   69
18  Fri   76  Fri   74
19  Sat   70  Sat   66
20  Sun   73  Sun   76
21  Mon   71  Mon   60
1   Tue   60 <NA>   NA
Sotos
  • 51,121
  • 6
  • 32
  • 66
  • I tried these steps. it works only for this case. For exemple, if we randomly remove a date from df1 or df2, it doesn't work anymore. – Bechir Barkallah Jan 09 '18 at 09:17
  • Here is my exemple : df2 <- data.frame(col3 = c('Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun', 'Mon', 'Tue', 'Thu', 'Fri', 'Sat', 'Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun', 'Mon'), col4 = c(107, 91, 70, 73, 71, 80, 80, 58, 64, 68, 65, 102, 101, 100, 69, 69, 74, 66, 76, 60)) – Bechir Barkallah Jan 09 '18 at 10:19
  • 1
    Thanks, @Sotos, it worked. I have a custom function to do unequal cbind. – Zmnako Awrahman Jan 09 '18 at 12:12
1

You could try this :

library(dplyr)

new_df1 <- df1 %>% 
  mutate(row_index = row_number()) %>%
  arrange(col1,row_index) %>%
  group_by(col1) %>%
  mutate(week = row_number()) %>% 
  ungroup() %>% mutate(day = paste(col1, week, sep= "_")) %>% 
  arrange(row_index)
new_df2 <- df2 %>% 
  mutate(row_index = row_number()) %>%
  arrange(col3,row_index) %>%
  group_by(col3) %>%
  mutate(week = row_number()) %>% 
  ungroup() %>% mutate(day = paste(col3, week, sep= "_")) %>% 
  arrange(row_index)

df <- new_df1 %>%
  full_join(new_df2,by = c("day" = "day")) %>% 
  select(col1, col2, col3, col4)
Bechir Barkallah
  • 341
  • 1
  • 12