-4

Can anyone help arranging long data to wide data, but complicated by linked results, ie listed in wide format identified by Study Number this repetitive results listed in wide format after the SN (I've shown an abbreviated table there are more results per patient listed along the bottom with repetitive in columns LabTest, LabDate, Result, Lower, Upper)...I've tried melt and recast, and binding columns but can't seem to get it to work. Over 1000 results to reformat so can't input results manually need to reformat a long data excel document in wide format in R Thank you

Original data looks like this

SN     LabTest     LabDate    Result Lower Upper
TD62   Creat       05/12/2004  22     30    90
TD62   AST         06/12/2004  652    6     45
TD58   Creat       26/05/2007  72     30    90
TD58   Albumin     26/05/2005  22     25    35  
TD14   AST         28/02/2007  234    6     45
TD14   Albumin     26/02/2007  15     25    35

Formatted data should look like this

SN LabTCode LabDate Result Lower Upper LabCode LabDate Result Lower Upper
TD62 Creat   05/12/04  22    30   90   AST     06/12/04  652   6    45
TD58 Creat   26/05/05  72    30   90   Alb     26/05/05  22    25   35
TD14 AST     28/02/07  92    30   90   Alb     26/02/07  15    25   35

Formatted data looks like this

So far I have tried:

data_wide2 <- dcast(tdl, SN + LabDate ~ LabCode, value.var="Result")

and

melt(tdl, id = c("SN", "LabDate"), measured= c("Result", "Upper", + "Lower"))
  • Hi, welcome to *Stack Overflow*, in order that we can help you, please provide example data and the steps you've tried so far. Consider [*How to make a great reproducible example*](https://stackoverflow.com/a/5963610/6574038), thank you. – jay.sf Feb 25 '18 at 21:03
  • are "linked results" anything special to R ? – Eugène Adell Feb 25 '18 at 21:21
  • Please do not provide the input as images since then no one can use it except by retyping the entire thing. Show the output of `dput(x)` in the question where `x` is the input. – G. Grothendieck Feb 25 '18 at 23:49

1 Answers1

0

Your issue is that R won't like the final table because it has duplicate column names. Maybe you need the data in that format but it's a bad way to store data because it would be difficult to put the columns back into rows again without a load of manual work.

That said, if you want to do it you'll need a new column to help you transpose the data.

I've used dplyr and tidyr below, which are worth looking at rather than reshape. They're by the same author but more modern and designed to fit together as part of the 'tidyverse'.

library(dplyr)
library(tidyr)

#Recreate your data (not doing this bit in your question is what got you downvoted)
df <- data.frame(
  SN = c("TD62","TD62","TD58","TD58","TD14","TD14"),
  LabTest = c("Creat","AST","Creat","Albumin","AST","Albumin"),
  LabDate = c("05/12/2004","06/12/2004","26/05/2007","26/05/2005","28/02/2007","26/02/2007"),
  Result = c(22,652,72,22,234,15),
  Lower = c(30,6,30,25,6,25),
  Upper = c(90,45,90,35,45,35),
  stringsAsFactors = FALSE
)

output <- df %>% 
  group_by(SN) %>% 
  mutate(id_number = row_number()) %>% #create an id number to help with tracking the data as it's transposed
  gather("key", "value", -SN, -id_number) %>% #flatten the data so that we can rename all the column headers
  mutate(key = paste0("t",id_number, key)) %>% #add id_number to the column names. 't' for 'test' to start name with a letter.
  select(-id_number) %>% #don't need id_number anymore
  spread(key, value)

  SN    t1LabDate  t1LabTest t1Lower t1Result t1Upper t2LabDate  t2LabTest t2Lower t2Result t2Upper
  <chr> <chr>      <chr>     <chr>   <chr>    <chr>   <chr>      <chr>     <chr>   <chr>    <chr>  
1 TD14  28/02/2007 AST       6       234      45      26/02/2007 Albumin   25      15       35     
2 TD58  26/05/2007 Creat     30      72       90      26/05/2005 Albumin   25      22       35     
3 TD62  05/12/2004 Creat     30      22       90      06/12/2004 AST       6       652      45 

And you're there, possibly with some sorting issues still to crack if you need the columns in a specific order.

NeilC
  • 188
  • 9
  • Thank you for your reply. The issue is I have over 1000 results in the large data set. Therefore manually adding in the results will not work. Is there a way round this? – Ruth Johnson Feb 26 '18 at 11:13
  • Do you mean the step where I recreated the data? You don't need to do that - I was writing a reproducible example where the code will run without loading your data. If you load your data (from wherever) as df, then the code from "output <- df %>%" will reshape it as you need. – NeilC Feb 26 '18 at 11:23
  • Thank you so much for your help! Works! really appreciate it, been struggling for over a week with this so far....Thank you – Ruth Johnson Feb 26 '18 at 12:27
  • Great to hear! Could you accept the answer and then the site will mark the question as done? – NeilC Feb 26 '18 at 14:35