1

I am working in R with a dataframe of four columns where the first two columns describe a name and a date. The fourth column describes the number of seconds spent in a particular status, but where there are only a small number of statuses that may be selected from. The third column is the status which the fourth column describes.

What I would like to do is expand this to where each value in the fourth column occupies a separate column with the header name being the status of the third column, but in turn reduce the overall number of rows such that there is only ever a single row for each Name/Date pairing. This would also need to populate 0 seconds for any column for which that name/date combination did not have a status.

This is probably best demonstrated with a before-and-after example:

Name        Date           Status       Seconds
Person1     2021-07-26     Ready        1800
Person1     2021-07-26     Busy         900
Person1     2021-07-27     Ready        42
Person1     2021-07-27     Meeting      2700
Person2     2021-07-26     Meeting      1800
Person2     2021-07-26     Busy         900
Person2     2021-07-26     Project      900
Person2     2021-07-27     Busy         3600

and the desired output would look something like:

Name        Date            Ready     Busy     Meeting    Project
Person1     2021-07-26      1800      900      0          0
Person1     2021-07-27      42        0        2700       0
Person2     2021-07-26      0         900      1800       900
Person2     2021-07-27      0         3600     0          0

I just want a single row for each (Name & Date) combination. Above is a very simplified version, but there are only 10-20 available statuses, and people can use multiple statuses throughout a day for various amounts of time. I figure the order of the added columns does not matter and can be adjusted afterwards.

Is there some function in R which can handle something like this? Or should I be trying to set up a second dataframe with the unique elements of the first two columns, then adding zeros across the board, and doing something like parsing the original dataframe and adding into one of the newly added columns whereever it finds a match? I've searched awhile and it seems like there should be a more elegant solution than that, but I've yet to find it. I admit I am still somewhat novice with R as well. Any help or point in the right direction is hugely appreciated!

Skweezle
  • 21
  • 4
  • 4
    Looks like `pivot_wider` could do it. Take a look at the tutorial links in these answers https://stackoverflow.com/questions/58244373/how-to-reshape-this-data-frame-having-two-variables-as-identifiers/58250482#58250482 & https://stackoverflow.com/questions/59236281/how-to-change-the-shape-of-a-data-frame-in-r-stacking-columns-with-the-same-na/59239858#59239858 – Tung Jul 28 '21 at 22:42
  • Thank you! The pivot_wider function worked very well! – Skweezle Jul 28 '21 at 23:44

0 Answers0