-1

Using R, I wish to combine rows (max of 2) per a common identifier from within the same dataset.

My dataset has an ID, Date and Text field.

df1 <- data.frame(ID = c("1", "1", "2", "3", "3"),
          Date = c("2017-02-12", "2017-03-12", "2017-02-14", "2016-02-12", "2014-02-12"),
          Text = c("Pie", "Cake", "Pie", "IceCream", "Chocolate"))

I want to combine rows based on the identifier without aggregating, for an end result that would double my column count.

Resulting in:

ID|Date.a     | Text.a  | Date.b     |Text.b  
1 |2017-02-12 | Pie     | 2017-03-12 | Cake  
2 |2017-02-14 | Pie     |            |  
3 |2017-02-14 | IceCream| 2017-02-12 |Chocolate  

Any help, much appreciated.

Jaap
  • 81,064
  • 34
  • 182
  • 193
Scottieie
  • 304
  • 8
  • 14
  • 2
    Please read the info about [how to ask a good question](http://stackoverflow.com/help/how-to-ask) and how to give a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610). This will make it much easier for others to help you. – Jaap Feb 24 '17 at 16:45
  • @Jaap clarified I hope. Perhaps you could offer some insight now? – Scottieie Feb 24 '17 at 17:25
  • 1
    I changed my original answer to attempt to better answer your question. Although I definitely recommend looking into the tidy data link I posted – tbradley Feb 24 '17 at 17:39
  • @Jaap, that looks so nice, with the even lines and all. How on earth did you do that? Looking around for a link to describe how to properly format isn't where I have looked apparently......my amature status is apparent isn't it, hence my 5, errr....3 tokens. – Scottieie Feb 25 '17 at 22:26
  • 1
    [see here](http://stackoverflow.com/editing-help) for how format you post properly – Jaap Feb 26 '17 at 11:30
  • Thank you @Jaap. That helps. – Scottieie Feb 27 '17 at 00:43

1 Answers1

1

It seems as though your want your data to become messier. Tidy Data is a pretty good resource in how data should typically be formatted in order to perform effective analyses.

Here is an example of a transformation you could make to view the data in the manner in which I think you are looking for while looking cleaner than what your requested output looks like.

library(tidyverse)
df1 <- data.frame(ID = c("1", "1", "2", "3", "3"),
              Date = c("2017-02-12", "2017-03-12", "2017-02-14", "2016-02-12", "2014-02-12"),
              Text = c("Pie", "Cake", "Pie", "IceCream", "Chocolate"))

df2 <- spread(df1, key = Text, value = Date)

> df2
  ID       Cake  Chocolate   IceCream        Pie
1  1 2017-03-12       <NA>       <NA> 2017-02-12
2  2       <NA>       <NA>       <NA> 2017-02-14
3  3       <NA> 2014-02-12 2016-02-12       <NA>
tbradley
  • 2,210
  • 11
  • 20
  • I probably should have made it more clear that it is within the same dataset that I wish to combine rows. I will work on making the question more clear. – Scottieie Feb 24 '17 at 17:07
  • In a sense I do want to make my data messier. The analysis has already happened and now I am doing some reporting. One of those looks at current vs previous records. – Scottieie Feb 24 '17 at 20:03
  • I appreciate your response to this, and it certainly would have solved a different problem. My goal was to join the two most current entries together for report purposes, side by side. My initial RODBC query pulled out the top two most recent rows per person, keying them 1 and if applicable, 2 (note that some didn't have a second record). I had that part figured out. I was flummoxed at how to join the top and bottom rows. Then separate them out into two tables using !grepl on the key field. Last step was to do a join. Thanks again for willingness to help. SW. – Scottieie Mar 11 '17 at 21:32