I have a data frame like this:
df<-structure(list(Date = structure(c(17605, 18179, 17605, 18508,
17626, 17837, 17963, 17900, 17823, 18008), class = "Date"), Patient = c("Doe, John",
"Doe, John", "Scissorhands, Edward", "Coleman, Ronnie", "Cuomo, Governor",
"Kent, Clark", "Wayne, Bruce", "Lane, Lois", "Gray, Dorian",
"Gray, Dorian"), MRN = c(12345, 12345, 54321,
65432, 765432, 9876, 87654, 111111,
101010, 101010), DOB = structure(c(5254, 5254, -1561,
-10629, 428, 3005, 3156, -2127, -5836, -5836), class = "Date"),
`ICD-10 Billed Procedure Code` = c("Detachment at Right Ring Finger, Mid, Open Approach",
"Detachment at Right Ring Finger, Low, Open Approach", "Detachment at Right Index Finger, Mid, Open Approach",
"Detachment at Left Hand, Complete 5th Ray, Open Approach",
"Detachment at Right Hand, Complete 5th Ray, Open Approach",
"Detachment at Right Little Finger, High, Open Approach",
"Detachment at Right Hand, Partial 1st Ray, Open Approach",
"Detachment at Left Index Finger, High, Open Approach", "Detachment at Right Index Finger, Low, Open Approach",
"Detachment at Left Index Finger, Low, Open Approach"), Admission = structure(c(0,
0, 0, 0, 1, 5, 2, 0, 0, 0), class = "difftime", units = "days")), row.names = c(NA,
-10L), class = c("tbl_df", "tbl", "data.frame"), problems = structure(list(
row = 456L, col = "Discharge Date", expected = "date like %m/%d/%Y",
actual = "c", file = "'Patients.csv'"), row.names = c(NA,
-1L), class = c("tbl_df", "tbl", "data.frame")))
And I'd like to pivot it wider so that there is only one row per patient. I can do that with this code and it works beautifully on the first ten rows of my data (what you see above):
df<-df%>%pivot_wider(names_from = `ICD-10 Billed Procedure Code`,values_from = c(Date,Admission))
It creates a bunch of columns that have info for the date of when the patients had each surgery and how long they were admitted for each one. Thats fine. I'll organize all those new columns later into what I want.
My problem arises when I try to run it on all of the rows I have (497 total) and you can see it happen here when I add an 11th row.
df2<-structure(list(Date = structure(c(17605, 18179, 17605, 18508,
17626, 17837, 17963, 17900, 17823, 18008, 18008), class = "Date"),
Patient = c("Doe, John",
"Doe, John", "Scissorhands, Edward", "Coleman, Ronnie", "Cuomo, Governor",
"Kent, Clark", "Wayne, Bruce", "Lane, Lois", "Gray, Dorian",
"Gray, Dorian",
"Gray, Dorian"), MRN = c(12345, 12345, 54321,
65432, 765432, 9876, 87654, 111111,
101010, 101010, 101010), DOB = structure(c(5254,
5254, -1561, -10629, 428, 3005, 3156, -2127, -5836, -5836,
-5836), class = "Date"), `ICD-10 Billed Procedure Code` = c("Detachment at Right Ring Finger, Mid, Open Approach",
"Detachment at Right Ring Finger, Low, Open Approach", "Detachment at Right Index Finger, Mid, Open Approach",
"Detachment at Left Hand, Complete 5th Ray, Open Approach",
"Detachment at Right Hand, Complete 5th Ray, Open Approach",
"Detachment at Right Little Finger, High, Open Approach",
"Detachment at Right Hand, Partial 1st Ray, Open Approach",
"Detachment at Left Index Finger, High, Open Approach", "Detachment at Right Index Finger, Low, Open Approach",
"Detachment at Left Index Finger, Low, Open Approach", "Detachment at Right Index Finger, Low, Open Approach"
), Admission = structure(c(0, 0, 0, 0, 1, 5, 2, 0, 0, 0,
0), class = "difftime", units = "days")), row.names = c(NA,
-11L), class = c("tbl_df", "tbl", "data.frame"), problems = structure(list(
row = 456L, col = "Discharge Date", expected = "date like %m/%d/%Y",
actual = "c", file = "'Patients.csv'"), row.names = c(NA,
-1L), class = c("tbl_df", "tbl", "data.frame")))
And all of the cells that should be NA get NULL instead, and the date formats get all wonky.
Given I know roughly what causes it, I figure it must be the fact that two "Dorian Gray" rows shared the same "Procedure code". (I think? Tell me if I'm wrong) The thing is, that'll be the case a few times and I guess I'm ok with it? ULTIMATELY I'll condense my output to look like: "Patient, MRN, DOB, Number_of_Surgeries, Total_Admission_Time" but I figured I tackle this pivot_wider first before condensing the columns into "Number of surgeries" and "total admission time".
So... how can I fix that error?