How to duplicate a row based on the presence of multiple values in a column in R

Question

I have a dataframe with a column trans containing phonetic transcriptions of words, and a column pos_num which records the position of the phoneme t in the transcription strings.

df <- data.frame(
  trans = c("ðət", "əˈpærəntli", "ˈkɒntrækt", "təˈwɔːdz", "pəˈteɪtəʊz"),
  stringsAsFactors = FALSE
)
df$pos_num <- sapply(strsplit(df$trans, ""), function(x) which(grepl("t", x)))

df
       trans pos_num
1        ðət       3
2 əˈpærəntli       8
3  ˈkɒntrækt    5, 9
4   təˈwɔːdz       1
5 pəˈteɪtəʊz    4, 7

In some transcriptions, t occurs more than once, resulting in multiple values in pos_num. Where this is the case I would like to duplicate the entire row, with the original row containing one value and the duplicated row containing the other value. The desired output would be:

df
       trans pos_num
1        ðət       3
2 əˈpærəntli       8
3  ˈkɒntrækt       5
4  ˈkɒntrækt       9
5   təˈwɔːdz       1
6 pəˈteɪtəʊz       4
7 pəˈteɪtəʊz       7

How can this be achieved? (There seem to be a few posts on that question for other programming languages but not R.)

`tidyr::unnest(df, pos_num)` – Ronak Shah, Oct 24 '20 at 07:58

score 1 · Accepted Answer · answered Oct 24 '20 at 08:06

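library(data.table)
setDT(df)  # convert df to a data.table by reference
# For each trans, flatten the list column so every position gets its own row
df[, .(pos_num = unlist(pos_num)), by = .(trans)]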
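
For comparison, here is a minimal base-R sketch of the same reshaping. It is not part of the original answer. The name long is only illustrative, and pos_num is typed in by hand from the question's printed output rather than recomputed, so treat it as an assumption about the data:

```r
# Rebuild the question's data; pos_num is written out as a list column
# using the values shown in the question's printed output (assumption).
df <- data.frame(
  trans = c("ðət", "əˈpærəntli", "ˈkɒntrækt", "təˈwɔːdz", "pəˈteɪtəʊz"),
  stringsAsFactors = FALSE
)
df$pos_num <- list(3L, 8L, c(5L, 9L), 1L, c(4L, 7L))

# lengths() counts the positions stored in each row; rep() repeats each
# transcription that many times, and unlist() flattens the positions
# in the same order, so the two columns stay aligned.
n <- lengths(df$pos_num)
long <- data.frame(
  trans = rep(df$trans, times = n),
  pos_num = unlist(df$pos_num),
  stringsAsFactors = FALSE
)
long
```

One thing to watch with this approach: a row whose pos_num is empty (no t in the transcription) has length 0 and would therefore drop out of the result.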

Vasily A