I have data that is really wide, so I am reshaping it to long format so that it can be analyzed. The data is not hierarchical, and it is really way too complicated to give you a working example, so I know this is going to be tough to answer.

Anyway, I need to reshape it three consecutive times.

longdata = reshape(widedata, direction = "long", varying = Issue, idvar = "Issue.ID")
longdata = reshape(longdata, direction = "long", varying = Resolution)
longdata = reshape(longdata, direction = "long", varying = Equipment)

It works the first two times, and the data being reshaped in the third line is set up exactly the same way as in the first two, so it's not that there is something weird with that vector. I can change the order of the calls and it still throws this error on the third reshape.

Error in `row.names<-.data.frame`(`*tmp*`, value = paste(ids, times[i],  : 
duplicate 'row.names' are not allowed  

I have tried getting rid of the row names like so:

longdata = reshape(widedata, direction = "long", varying = Issue, idvar = "Issue.ID")
rownames(longdata) <- NULL
longdata = reshape(longdata, direction = "long", varying = Resolution)
rownames(longdata) <- NULL
longdata = reshape(longdata, direction = "long", varying = Equipment)

but I still get the same error. What do I need to do in order for this to work?

EDIT*

I'm going to try to give some sample data; it will probably be a really long post now, sorry.

Issue.ID = c("CBICR1Q2201704000", "CBICR1Q2201704001", 
"CBICR1Q2201704002", "CBICR1Q2201704003", "CBICR1Q2201704004", 
"CBICR1Q2201704005", "CBICR1Q2201704006", "CBICR1Q2201704007", 
"CBICR1Q2201704008", "CBICR1Q2201704009", "CBICR1Q2201704010", 
"CBICR1Q2201704011", "CBICR1Q2201704012", "CBICR1Q2201704013", 
"CBICR1Q2201704014", "CBICR1Q2201704015", "CBICR1Q2201704016", 
"CBICR1Q2201704017", "CBICR1Q2201704018", "CBICR1Q2201704019")
Issue.1 = c("Difficulty receiving products in general", 
"Supplier compliance issues", "Supplier fraud, waste, or abuse", 
"Difficulty receiving products in general", "Difficulty receiving products in general", 
"Supplier fraud, waste, or abuse", "Supplier service issues", 
"Problems repairing due to service issues ", "Problems repairing due to service issues ", 
"Other", "Billing, coverage, coordination of benefits", "Problems repairing due to service issues ", 
"Difficulty receiving products in general", "Difficulty receiving products in general", 
"Low quantity/quality", "Difficulty receiving products in general", 
"Difficulty receiving products in general", "Supplier service issues", 
"Problems repairing due to service issues ", "Problems repairing due to service issues ")
Issue.2 = c("Supplier compliance issues", "Billing, coverage, coordination of benefits", 
"Supplier service issues", "Supplier service issues", "Low quantity/quality", NA, "DMEPOS information issues", "Supplier fraud, waste, or abuse", 
"Supplier compliance issues", NA, "DMEPOS information issues", 
"Supplier compliance issues", "Supplier compliance issues", "Supplier service issues", 
"Supplier service issues", "Supplier service issues", "Supplier service issues", 
"DMEPOS information issues", NA, "Supplier compliance issues")

Equipment.1 = c("Oxygen Supplies/Equipment", 
"Continuous Positive Airway Pressure (CPAP) / Respiratory Assist Device (RAD)", 
"Nebulizers", "Lifts", "Oxygen Supplies/Equipment", "Walking Aids", 
"Power Mobility Devices (PMDs) other than scooter", "Power Mobility Devices (PMDs) other than scooter", 
"Continuous Positive Airway Pressure (CPAP) / Respiratory Assist Device (RAD)", 
"Continuous Positive Airway Pressure (CPAP) / Respiratory Assist Device (RAD)", 
"Continuous Positive Airway Pressure (CPAP) / Respiratory Assist Device (RAD)", 
"Walking Aids", "Hospital beds", "Power Mobility Devices (PMDs) other than scooter", 
"Oxygen Supplies/Equipment", "Hospital beds", "Oxygen Supplies/Equipment", 
"Continuous Positive Airway Pressure (CPAP) / Respiratory Assist Device (RAD)", 
"Power Mobility Devices (PMDs) other than scooter", "Power Mobility Devices (PMDs) other than scooter"
)
Equipment.2 = c(NA_character_, NA_character_, NA_character_, 
NA_character_, NA_character_, NA_character_, NA_character_, 
NA_character_, NA_character_, NA_character_, NA_character_, 
NA_character_, NA_character_, NA_character_, NA_character_, 
NA_character_, NA_character_, NA_character_, NA_character_, 
NA_character_)

Resolution.1 = c("Current supplier resolved the issue", 
"Current supplier resolved the issue", "Current supplier resolved the issue", 
"Supplier educated about inquiry\n", "Beneficiary educated about inquiry ", 
"Supplier educated about inquiry\n", "Beneficiary educated about DMEPOS\n", 
"Beneficiary educated about inquiry ", "Beneficiary educated about inquiry ", 
"Beneficiary educated about inquiry ", "Beneficiary educated about suppliers", 
"The case unresolved ", "The case unresolved ", "Beneficiary educated about DMEPOS\n", 
"Current supplier resolved the issue", "Current supplier resolved the issue", 
"Beneficiary educated about DMEPOS\n", "Beneficiary educated about suppliers", 
"New supplier found ", "Beneficiary educated about suppliers"
)
Resolution.2 = c(NA, NA, NA, "Current supplier resolved the issue", 
NA, "Reimbursement or refund ", "Supplier educated about DMEPOS_x000D_\n", 
"Beneficiary educated about suppliers", "Beneficiary educated about DMEPOS\n", 
"Current supplier resolved the issue", "New supplier found ", 
"Beneficiary educated about DMEPOS\n", NA, "Beneficiary educated about suppliers", 
"Beneficiary educated about inquiry ", "Supplier educated about inquiry_x000D_\n", 
"Beneficiary educated about inquiry ", "New supplier found ", 
NA, "Supplier educated about inquiry\n")

widedata<-data.frame(Issue.ID,Issue.1,Issue.2,Resolution.1,Resolution.2,Equipment.1,Equipment.2)
Issue <- c("Issue.1","Issue.2")
Equipment <- c("Equipment.1","Equipment.2")
Resolution <- c("Resolution.1","Resolution.2")
  • You're correct, it is tough to answer without example data. Can you give it a try? – neilfws Oct 16 '17 at 20:58
  • I added some sample data – Travers Woodward Oct 16 '17 at 21:44
  • Are you just looking for `reshape(widedata, direction="long", varying=2:ncol(widedata), idvar = "Issue.ID")`? – A5C1D2H2I1M1N2O1R2T1 Oct 17 '17 at 03:58
  • To provide some clarity, I originally reshaped Issue, Equipment, and Resolution all with one reshape. But I want to do visualizations that associate every single issue with every single resolution with every single equipment: `mydata <- subset(longdata, !is.na(Resolution)); mydata <- subset(mydata, !is.na(Issue)); ggplot(data=mydata, aes(Issue, fill = Resolution)) + geom_bar() + coord_flip()`. So I have to do each reshape separately to make sure every value is counted properly. If I can just get rid of this error, the data will be in the form that I want. – Travers Woodward Oct 17 '17 at 13:00
  • Does my solution not do what you're describing? – Mako212 Oct 17 '17 at 14:42

1 Answer

I think we can do this with the data.table package and melt. It looks to me like Issue, Equipment, and Resolution all go together, so we define the measure-variables argument with regex patterns (via patterns()) so that the columns are grouped correctly. The value-name argument just names the melted columns.

By doing this, we end up with a single column each for Issue, Equipment, and Resolution, and Issue.1, Issue.2, etc. become the rows.

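library(data.table)

setDT(widedata)  # convert to a data.table by reference

# One regex per measure group; value.name names the melted columns
df1 <- melt(widedata, id.vars = "Issue.ID",
            measure.vars = patterns("^Issue\\.\\d+", "^Equipment.*", "^Resolution.*"),
            value.name = c("Issue", "Equipment", "Resolution"))[order(Issue.ID)]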

head(df1)

            Issue.ID variable                                       Issue                                                                    Equipment                          Resolution
1: CBICR1Q2201704000        1    Difficulty receiving products in general                                                    Oxygen Supplies/Equipment Current supplier resolved the issue
2: CBICR1Q2201704000        2                  Supplier compliance issues                                                                           NA                                  NA
3: CBICR1Q2201704001        1                  Supplier compliance issues Continuous Positive Airway Pressure (CPAP) / Respiratory Assist Device (RAD) Current supplier resolved the issue
4: CBICR1Q2201704001        2 Billing, coverage, coordination of benefits                                                                           NA                                  NA
5: CBICR1Q2201704002        1             Supplier fraud, waste, or abuse                                                                   Nebulizers Current supplier resolved the issue
6: CBICR1Q2201704002        2                     Supplier service issues                                                                           NA                                  NA

Also, note that melt is originally a reshape2 function, but data.table has implemented a version with more features, and in this case we're leveraging the ability to define patterns.

Mako212
  • Updated from initial solution – Mako212 Oct 16 '17 at 21:55
  • Unfortunately this solution just associates the first issue with the first resolution. In my data it is possible to have 4 different Issues, 4 different Resolutions, and 4 different equipment selections, all within the same "Issue.ID". They are not hierarchical, so I believe the method I am attempting above is the only way. Can you possibly explain what error it is giving me? – Travers Woodward Oct 17 '17 at 15:08
  • So your solution is not much different than this, which turns the table from wide to long: `Columns <- c(Issue, Equipment, Resolution)` followed by `longdata = reshape(widedata, direction="long", varying=Columns, idvar="Issue.ID")` – Travers Woodward Oct 17 '17 at 15:13
  • @TraversWoodward Can you provide a sample of what you want the output to look like? I don't understand how these solutions fall short. In your above code, and in mine, for each ID you'll have one row for each Equipment|Issue|Resolution set. So if you have 4 different issues, you'll have four different rows, which is the definition of having your data in "long" format. – Mako212 Oct 17 '17 at 15:59
  • If you run the code I provided in the initial question, the `longdata` set is correct for issues and resolutions, meaning that every issue is associated with every resolution, so that I can create this ggplot: `ggplot(data=longdata, aes(Issue, fill = Resolution)) + geom_bar()+coord_flip()`. In your code and in my code above, the first issue will only be paired with the first resolution, so when you have multiple resolutions or multiple issues, associations will be missed. There may be a better way to create this graph, which I would be more than open to. – Travers Woodward Oct 17 '17 at 17:59
  • I really just don't get why the third line won't reshape. Why can't I have duplicate row names, and how do I remedy this? – Travers Woodward Oct 17 '17 at 19:43
  • @TraversWoodward how about `longdata <- reshape(widedata,direction="long", varying=c("Issue.1", "Issue.2","Equipment.1","Equipment.2","Resolution.1", "Resolution.2"), idvar="Issue.ID")` – Mako212 Oct 17 '17 at 20:14
  • That's exactly what my previous comment does; I just assigned the vector to a name, `Columns <- c("Issue.1", "Issue.2","Equipment.1","Equipment.2","Resolution.1", "Resolution.2")`. The issue is that I need to do the reshapes separately and iteratively. – Travers Woodward Oct 17 '17 at 21:13
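
On why the third reshape fails: a likely explanation, worth checking against your own data, is that when `idvar` is not supplied, `reshape()` falls back to a column named `id`. The second call has no such column, so it numbers the rows itself and then adds an `id` column, with those numbers recycled over the stacked copies. By the third call `id` already exists and contains duplicates, and the row names built from it collide. Resetting `rownames()` does not help because the names are rebuilt from `id` on every call. The sketch below uses a small made-up data frame with the same column layout; `wide`, `step1`, `step2`, `step3`, and `failed` are illustrative names, not from the post.

```r
# Toy stand-in for widedata (hypothetical values, same column layout)
wide <- data.frame(
  Issue.ID     = c("A", "B"),
  Issue.1      = c("i1", "i2"),
  Issue.2      = c("i3", NA),
  Resolution.1 = c("r1", "r2"),
  Resolution.2 = c(NA, "r3"),
  Equipment.1  = c("e1", "e2"),
  Equipment.2  = c(NA_character_, NA_character_),
  stringsAsFactors = FALSE
)

# First pass: Issue.ID is unique, so the generated row names are unique
step1 <- reshape(wide, direction = "long",
                 varying = c("Issue.1", "Issue.2"), idvar = "Issue.ID")

# Second pass: no idvar given and no "id" column yet, so rows are numbered
# internally and an "id" column is added to the result
step2 <- reshape(step1, direction = "long",
                 varying = c("Resolution.1", "Resolution.2"))
anyDuplicated(step2$id) > 0   # the new id column repeats its values

# Third pass as written in the question: "id" exists and is not unique
failed <- inherits(
  try(reshape(step2, direction = "long",
              varying = c("Equipment.1", "Equipment.2")), silent = TRUE),
  "try-error"
)

# Possible fix: drop the stale id so rows are numbered afresh
step2$id <- NULL
step3 <- reshape(step2, direction = "long",
                 varying = c("Equipment.1", "Equipment.2"))
nrow(step3)
```

With this toy input, `step3` should hold every issue/resolution/equipment combination per `Issue.ID` (2 IDs x 2 x 2 x 2 rows), including rows with `NA`, which can be filtered with `subset()` as in the comments above. Note that the default `time` column is overwritten on each pass; passing a different `timevar` to each call would keep the three indices apart.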