Suppose I have a dataframe like this, with w1 representing words and d1, d2, etc. representing durations in discourse:
set.seed(12)
df <- data.frame(
  w1 = sample(LETTERS[1:4], 10, replace = TRUE),
  d1 = c(rep(NA, 3), round(rnorm(7), 3)),
  d2 = c(round(rnorm(6), 3), NA, round(rnorm(3), 3)),
  d3 = c(round(rnorm(2), 3), rep(NA, 2), round(rnorm(6), 3)),
  d4 = c(round(rnorm(1), 3), NA, round(rnorm(8), 3))
)
df
   w1     d1     d2    d3     d4
1   D     NA -0.043 0.314 -2.149
2   C     NA -0.113 0.407     NA
3   A     NA  0.457    NA  0.971
4   D -1.596  2.020    NA  1.145
5   C -0.309 -1.051 0.994 -0.525
6   D  0.449  0.735 0.856  0.250
7   A -0.977     NA 0.197 -0.429
8   A  0.190  0.539 0.834 -0.183
9   C  0.731 -1.314 0.847 -0.103
10  B -0.493 -0.250 1.954 -0.634
As d1, d2, etc. are in fact one and the same variable, I'd like to concatenate them into a single vector. This is easily done as follows:
d <- c(df$d1, df$d2, df$d3, df$d4)
d
[1] NA NA NA -1.596 -0.309 0.449 -0.977 0.190 0.731 -0.493 -0.043 -0.113 0.457 2.020
[15] -1.051 0.735 NA 0.539 -1.314 -0.250 0.314 0.407 NA NA 0.994 0.856 0.197 0.834
[29] 0.847 1.954 -2.149 NA 0.971 1.145 -0.525 0.250 -0.429 -0.183 -0.103 -0.634
However, my real dataframe has many such duration columns, and concatenating them this way is tedious. So I tried using the apply family of functions, but the results are not what I want:
lapply(df[,2:5], c)
$d1
[1] NA NA NA -1.596 -0.309 0.449 -0.977 0.190 0.731 -0.493
$d2
[1] -0.043 -0.113 0.457 2.020 -1.051 0.735 NA 0.539 -1.314 -0.250
$d3
[1] 0.314 0.407 NA NA 0.994 0.856 0.197 0.834 0.847 1.954
$d4
[1] -2.149 NA 0.971 1.145 -0.525 0.250 -0.429 -0.183 -0.103 -0.634
sapply(df[,2:5], c)
          d1     d2    d3     d4
 [1,]     NA -0.043 0.314 -2.149
 [2,]     NA -0.113 0.407     NA
 [3,]     NA  0.457    NA  0.971
 [4,] -1.596  2.020    NA  1.145
 [5,] -0.309 -1.051 0.994 -0.525
 [6,]  0.449  0.735 0.856  0.250
 [7,] -0.977     NA 0.197 -0.429
 [8,]  0.190  0.539 0.834 -0.183
 [9,]  0.731 -1.314 0.847 -0.103
[10,] -0.493 -0.250 1.954 -0.634
How must the code be changed to get the desired result, shown in d?
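For reference, here is one possible direction, offered as a sketch and not as the definitive fix. The attempts above apply c to each column separately, so lapply returns one list element per column and sapply simplifies those into a matrix. Flattening the columns in one step with unlist() seems to give the vector wanted here. The data frame below is rebuilt from the printed values so the example does not depend on the random seed. The names dur_cols, d_all, d_mat and d_manual, and the pattern "^d[0-9]+$" for picking out the duration columns, are assumptions made for illustration.

```r
# Data frame rebuilt from the values printed in the question
df <- data.frame(
  w1 = c("D", "C", "A", "D", "C", "D", "A", "A", "C", "B"),
  d1 = c(NA, NA, NA, -1.596, -0.309, 0.449, -0.977, 0.190, 0.731, -0.493),
  d2 = c(-0.043, -0.113, 0.457, 2.020, -1.051, 0.735, NA, 0.539, -1.314, -0.250),
  d3 = c(0.314, 0.407, NA, NA, 0.994, 0.856, 0.197, 0.834, 0.847, 1.954),
  d4 = c(-2.149, NA, 0.971, 1.145, -0.525, 0.250, -0.429, -0.183, -0.103, -0.634)
)

# Pick the duration columns by name instead of by position
# (assumed naming scheme: "d" followed by digits)
dur_cols <- grep("^d[0-9]+$", names(df), value = TRUE)

# unlist() flattens the selected columns into one vector, column by column;
# use.names = FALSE drops the generated names such as "d11", "d12", ...
d_all <- unlist(df[dur_cols], use.names = FALSE)

# Alternative: convert to a matrix and drop its dimensions (column-major order)
d_mat <- as.vector(as.matrix(df[dur_cols]))

# Compare with the manual concatenation from the question
d_manual <- c(df$d1, df$d2, df$d3, df$d4)
identical(d_all, d_manual)
identical(d_mat, d_manual)
```

Selecting columns by a name pattern should scale to many duration columns, as long as they appear in the data frame in the order in which they are meant to be concatenated.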