Join data table with different number of rows and column names

Question

I have two tables that I want to join with different number of rows and different column names(FzR/FzL):

a <- data.frame(FzR = c(130.8616, 130.9943, 131.1719, 131.3771, 131.6092, 131.8660, 132.1437, 132.4384, 132.7980, 133.2867), limb = c("L","L","L","L","L","L","L","L","L","L"), time = c(1.724178, 1.725178, 1.726178, 1.727178, 1.728178, 1.729178, 1.730178, 1.731178, 1.732179, 1.733179))                                                                                      

b <- data.frame(FzL = c(134.8616, 130.9943, 134.1719, 135.3771, 135.6092, 135.8660, 135.1437, 135.4384), limb = c("R","R","R","R","R","R","R","R"), time = c(2.724178, 2.725178, 2.726178, 2.727178, 2.728178, 2.729178, 2.730178, 2.731178))

I would like to join these two tables to look like this:

   Fz          limb         time 
130.8616        L            0
130.9943        L           0.01
131.1719        L           0.02
131.3771        L           0.03
131.6092        L           0.04
131.8660        L           0.05
132.1437        L           0.06
132.4384        L           0.07
132.7980        L           0.08
133.2867        L           0.09
134.8616        R           0.10
130.9943        R           0.11
134.1719        R           0.12
135.3771        R           0.13
135.6092        R            NA
135.8660        R            NA
135.1437        R            NA
135.4384        R            NA

My objective is to gather the FzR, FzL, and limb columns on top of each other, drop the "time" columns from the a and b dataframes and create a new time vector in the new dataframe, in order to graph Fz versus time according to limb.

Code I have tried:

rename_1 <- names(a)[names(a) == 'FzR'] <- 'Fz'
rename_2 <- names(b)[names(b) == 'FzL'] <- 'Fz'
rbind(rename_1,rename_2)

My thought process was to rename the FzR and FzL column names to a common name of Fz in order to perform an rbind, however this did not give me the correct output. Any help would be much appreciated, thank you.

@RicS After I join the table, I was going to mutate a new column time into the dataframe — mpvalenc, Jul 01 '20 at 08:08
I mean, what is the logic behind the values from 0 to 0.13? And why are there NAs in the last rows? — Ric S, Jul 01 '20 at 08:10
@RicS in my real dataset FzR is 6,822 rows and FzL is 7,447 rows making the Fz column once joined 14,269 rows. I need a time sequence from 0-10sec. I figured that time doesn't need to be 0-10 seconds in 14,269 rows. — mpvalenc, Jul 01 '20 at 08:16

Johann · Answer 1 · 2020-07-01T08:25:19.213

Your mistake was in the double assignment, as doing so you only save the name of the column and you "lose" the data:

rename_1 <- names(a)[names(a) == 'FzR'] <- 'Fz'
> rename_1
[1] "Fz"

You can just rename the first column of a and b and rbind them:

names(a)[1] = "Fz"
names(b)[1] = "Fz"
res = rbind(a,b)

Time sequence can be changed later with something like:

res$time = seq(from=0.01,by=0.01,length.out = nrow(res))

Final result:

         Fz limb time
1  130.8616    L 0.01
2  130.9943    L 0.02
3  131.1719    L 0.03
4  131.3771    L 0.04
5  131.6092    L 0.05
6  131.8660    L 0.06
7  132.1437    L 0.07
8  132.4384    L 0.08
9  132.7980    L 0.09
10 133.2867    L 0.10
11 134.8616    R 0.11
12 130.9943    R 0.12
13 134.1719    R 0.13
14 135.3771    R 0.14
15 135.6092    R 0.15
16 135.8660    R 0.16
17 135.1437    R 0.17
18 135.4384    R 0.18

How would I modify the time sequence code in order for time to restart at 0.000 when the first R in limb column comes? (i.e. row 11, time start at 0.00) — mpvalenc, Jul 01 '20 at 08:59
you need to create two vectors and combine them: res$time = c(seq(from=0,by=0.01,length.out = nrow(res[res$limb=="L",])), seq(from=0,by=0.01,length.out = nrow(res[res$limb=="R",])) ) — Johann, Jul 01 '20 at 09:04

Ric S · Accepted Answer · 2020-07-01T09:05:53.233

(Edited after discussion in the comments)

A dplyr would be something like

library(dplyr)

bind_rows(a, b) %>% 
  mutate(Fz = coalesce(FzR, FzL)) %>% 
  select(Fz, limb, time) %>% 
  group_by(limb) %>% 
  mutate(time = (seq_along(Fz)-1)*0.001)

In this way the newly created variable time will be a sequence of values from 0 to the number of rows for each limb, multiplied by a factor of 0.001 (so they will be milliseconds). For both limbs L and R time will start at 0.

Output

# A tibble: 18 x 3
# Groups:   limb [2]
      Fz limb   time
   <dbl> <chr> <dbl>
 1  131. L     0    
 2  131. L     0.001
 3  131. L     0.002
 4  131. L     0.003
 5  132. L     0.004
 6  132. L     0.005
 7  132. L     0.006
 8  132. L     0.007
 9  133. L     0.008
10  133. L     0.009
11  135. R     0    
12  131. R     0.001
13  134. R     0.002
14  135. R     0.003
15  136. R     0.004
16  136. R     0.005
17  135. R     0.006
18  135. R     0.007

This may be for another post, but I wanted to plot L and R limb on top of each other, since time associated with R is later and doesn't start at 0.000 but 0.010 I cannot do this. How would I change the time so that when the limb changes from L to R, time changes to 0.000 again? — mpvalenc, Jul 01 '20 at 08:44

Join data table with different number of rows and column names

2 Answers2