0

I have a rather inconvenient data set that looks like this:

sample_df <- tibble::tribble(
                              ~project,     ~01_2020,   ~05_2021,    ~07_2020   
                                "Mark",       "1000",     "7000",       "N/A",     
                                "Boat",        "N/A",     "8300",       "N/A",       
                            "Delorean",       "8400",     "1050",       "N/A",      
                               "Parix",        "N/A",      "N/A",      "4100",
)

The columns, whose names represent months, contain sums for the given projects in the given months. What I need is two new columns: c("time", "sum"). The first column should represent months if there was an entry in the months 01.2020 through 10.2021, and the column "sum" should contain the respective entries' sums. The desired output should look like this:

 project      time     sum 
  Mark     "01_2020"   1000
  Mark     "05_2021"   7000
  Boat     "05_2021"   8300
Delorean   "01_2020"   1050
  Parix    "07_2020"   4100
Mark Rotteveel
  • 100,966
  • 191
  • 140
  • 197
user000
  • 123
  • 7
  • 1
    Hi @user000. This is in essence a simple reshape from wide to long, with some minor post-processing & cleaning (removing all `"N/A"` rows etc.). Please note that your sample data is not reproducible and throws an error. – Maurits Evers Nov 18 '21 at 12:41
  • 1
    After fixing your input data, `sample_df %>% pivot_longer(-project, names_to = "time", values_to = "sum") %>% filter(sum != "N/A")` does the job. You probably want to convert entries in the `sum` column to `numeric`. – Maurits Evers Nov 18 '21 at 12:43

0 Answers0