Given a data frame with a column whose cells each hold an unpredictable number of values, how do I gather() it into a long-form dataset? If every entry in the column held the same number of values, I would split it into multiple columns first and then gather. In this case, though, some rows have just one value for this variable while others have arbitrarily many, with a regular separator between them.

While searching, I found a post that does the opposite of what I'm asking, see here:

https://markhneedham.com/blog/2015/06/27/r-dplyr-squashing-multiple-rows-per-group-into-one/

In other words, I want to transform this:

   winner            years
1  Andy Roddick      2009
2  David Nalbandian  2005
3  Grigor Dimitrov   2014
4  Marcos Baghdatis  2006
5  Rafael Nadal      2011, 2010, 2008
6  Roger Federer     2012

Into this:

   winner            years
1  Andy Roddick      2009
2  David Nalbandian  2005
3  Grigor Dimitrov   2014
4  Marcos Baghdatis  2006
5  Rafael Nadal      2011
6  Roger Federer     2012
7  Rafael Nadal      2010
8  Rafael Nadal      2008

I know the separating character, but I don't have an upper bound on how many years may appear in each row of the first data frame. Is there a way to get gather() to do this?
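
For reference, here is a minimal sketch of one possible approach that does not use gather() itself: tidyr's separate_rows() splits a delimited column into one row per value. The data frame name `winners` and the column names `winner` and `years` are assumptions taken from the desired output above, and the separator is assumed to be a comma followed by optional whitespace.

```r
library(tidyr)

# Hypothetical reconstruction of the input shown above;
# `years` is a character column with comma-separated values.
winners <- data.frame(
  winner = c("Andy Roddick", "David Nalbandian", "Grigor Dimitrov",
             "Marcos Baghdatis", "Rafael Nadal", "Roger Federer"),
  years  = c("2009", "2005", "2014", "2006", "2011, 2010, 2008", "2012"),
  stringsAsFactors = FALSE
)

# Split `years` on the separator, producing one row per value.
# No upper bound on the number of values per cell is needed.
long <- separate_rows(winners, years, sep = ",\\s*")

print(long)
```

Unlike the desired output above, this keeps each winner's rows adjacent, so the three Rafael Nadal rows come out together rather than at the end. The years column stays character unless convert = TRUE is passed.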
