I am working with proteomic data and testing differences between versions of the analysis software. We want a table that shows which versions of the software each protein appears in.
Below is a simplified version of the data table I currently have:
Version  Protein.ID  Protein name
1.1      A           name 1
1.2      A           name 1
1.1      B           name 2
1.2      B           name 2
I want my table to look like this:
Version   Protein.ID  Protein name
1.1, 1.2  A           name 1
1.1, 1.2  B           name 2
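To make the example reproducible, here is a minimal sketch that builds both tables in base R. The object names `example.ids` and `wanted` and the column name `Protein.name` are placeholders made up for this sketch; my real data frame is `allver.ids`, with the versions in `Ver.ID.Porder`. I am assuming the version is stored as character so that a value like "1.1, 1.2" can sit in the same column.

```r
# Simplified input: one row per (version, protein) pair.
# `example.ids` and `Protein.name` are placeholder names for this sketch.
example.ids <- data.frame(
  Version      = c("1.1", "1.2", "1.1", "1.2"),
  Protein.ID   = c("A", "A", "B", "B"),
  Protein.name = c("name 1", "name 1", "name 2", "name 2"),
  stringsAsFactors = FALSE
)

# Desired output: one row per protein, with its versions collapsed
# into a single comma-separated string.
wanted <- data.frame(
  Version      = c("1.1, 1.2", "1.1, 1.2"),
  Protein.ID   = c("A", "B"),
  Protein.name = c("name 1", "name 2"),
  stringsAsFactors = FALSE
)

print(example.ids)
print(wanted)
```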
I have been searching here and on the web for two days and cannot find a solution.
I have tried spread and aggregate, but neither worked: I got either a huge number of columns or a single column lacking the information I was after. I also tried base R functions such as paste, but could not get rid of the duplicate values.
Example of something I tried:
allver.mergeVerID <- spread(allver.ids, Protein.ID, Ver.ID.Porder)
Error: Each row of output must be identified by a unique combination of keys.
Keys are shared for 5311 rows:
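The same error seems to be reproducible on the simplified table. A sketch, assuming tidyr is installed; `example.ids` and `Protein.name` are placeholder names for this toy data, not the objects in my real analysis:

```r
library(tidyr)  # assumption: tidyr is installed; spread() is superseded but still available

# Toy stand-in for allver.ids (placeholder names, not my real data).
example.ids <- data.frame(
  Version      = c("1.1", "1.2", "1.1", "1.2"),
  Protein.ID   = c("A", "A", "B", "B"),
  Protein.name = c("name 1", "name 1", "name 2", "name 2"),
  stringsAsFactors = FALSE
)

# Protein.ID becomes the new column names and Version the cell values,
# so Protein.name is the only column left to identify an output row.
# Both "A" rows share Protein.name "name 1", so two Version values
# compete for the same output cell.
result <- tryCatch(
  spread(example.ids, Protein.ID, Version),
  error = function(e) conditionMessage(e)
)

# If the call fails as I expect, `result` holds the error message text.
print(result)
```

If that reading is right, the message is about several input rows mapping to the same output cell rather than about the data being malformed.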
I also get this error using
allver.mergeVerID <- allver.ids %>%
  group_by(Protein.ID) %>%
  summarise(Ver.ID.Porder = toString(Ver.ID.Porder))
OR
allver.mergeVerID <- aggregate(Ver.ID.Porder ~ Protein.ID, allver.ids, toString)
What does this error mean?