I have a data frame with a column trans containing phonetic transcriptions of words, and a column pos_num which records the position of the phoneme t in each transcription string.
df <- data.frame(
  trans = c("ðət", "əˈpærəntli", "ˈkɒntrækt", "təˈwɔːdz", "pəˈteɪtəʊz"),
  stringsAsFactors = FALSE
)
# split each transcription into characters and record where "t" occurs
df$pos_num <- sapply(strsplit(df$trans, ""), function(x) which(grepl("t", x)))
df
       trans pos_num
1        ðət       3
2 əˈpærəntli       8
3  ˈkɒntrækt    5, 9
4   təˈwɔːdz       1
5 pəˈteɪtəʊz    4, 7
In some transcriptions, t occurs more than once, resulting in multiple values in pos_num. Where this is the case I would like to duplicate the entire row, with the original row containing one value and the duplicated row containing the other value. The desired output would be:
df
       trans pos_num
1        ðət       3
2 əˈpærəntli       8
3  ˈkɒntrækt       5
4  ˈkɒntrækt       9
5   təˈwɔːdz       1
6 pəˈteɪtəʊz       4
7 pəˈteɪtəʊz       7
How can this be achieved? (There seem to be a few posts on that question for other programming languages but not R.)
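For reference, one way the row duplication described above might be done in base R is sketched here. It is an illustration under assumptions, not a definitive answer: the names pos and expanded are made up for the example, and the positions are kept in a plain list rather than in the pos_num column produced by sapply.

```r
# Rebuild the example data from the question.
df <- data.frame(
  trans = c("ðət", "əˈpærəntli", "ˈkɒntrækt", "təˈwɔːdz", "pəˈteɪtəʊz"),
  stringsAsFactors = FALSE
)

# Positions of "t" in each word, as a list with one integer vector per row
# (`pos` is a hypothetical helper name, not part of the original code).
pos <- lapply(strsplit(df$trans, ""), function(x) which(x == "t"))

# Repeat each transcription once per match and flatten the positions,
# so that every row ends up with exactly one position value.
expanded <- data.frame(
  trans = rep(df$trans, lengths(pos)),
  pos_num = unlist(pos),
  stringsAsFactors = FALSE
)
expanded
```

Note that under this sketch a word containing no t has zero matches and would be dropped from the result entirely. If pos_num is already stored as a list column, tidyr::unnest(df, pos_num) may be a shorter alternative, assuming the tidyr package is available.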