0

I am trying to make a long table wide, while creating unique variables to retain the granular detail i.e. combining variables with a sequence variable var1.seq1 var1.seq2

reshape seemed to be my saviour, but I keep coming across an undefined columns selected error.

n.b. for simplicity I haven't included the sample data with the full range of sequence numbers, however they do go up to 180.

Data sample available on github here


reshape(df, idvar = "MergeEncounterRecno", timevar = "Sequenceno", direction = "wide")

Error in [.data.frame(data, , timevar) : undefined columns selected

SecretAgentMan
  • 2,856
  • 7
  • 21
  • 41
Charlie R
  • 7
  • 2

2 Answers2

1

It looks like a typo. Try this:

#Code
dfres <- reshape(df, idvar = "MergeEncounterRecno", timevar = "SequenceNo", direction = "wide")
Duck
  • 39,058
  • 13
  • 42
  • 84
0

Here's an example using pivot longer:

Do you want to pivot the column NationalDiagnosis then use the 2nd pivot function transforming the values into characters (instead of numeric).

library(tidyverse)

df <- read_csv("https://raw.githubusercontent.com/Chazzer90/stackoverflowhelp2/main/SEQ_anom.csv")
#> Parsed with column specification:
#> cols(
#>   `<ef>..MergeRecno` = col_double(),
#>   MergeEncounterRecno = col_double(),
#>   SequenceNo = col_double(),
#>   DiagnosticSchemeCode = col_double(),
#>   DiagnosisCode = col_double(),
#>   DiagnosisSiteCode = col_character(),
#>   NationalDiagnosisCode = col_double(),
#>   NationalDiagnosis = col_character()
#> )

df %>% 
  mutate(DiagnosisSiteCode = as.integer(ifelse(DiagnosisSiteCode == "NULL", NA, DiagnosisSiteCode))) %>% 
  pivot_longer(cols = DiagnosticSchemeCode:NationalDiagnosisCode,
               names_to = 'variables', values_to = 'Values',
               values_drop_na = TRUE,
               names_ptypes = list(Values = integer()))
#> # A tibble: 134 x 6
#>    `\xef..MergeRec~ MergeEncounterR~ SequenceNo NationalDiagnos~ variables
#>               <dbl>            <dbl>      <dbl> <chr>            <chr>    
#>  1              402           545353          1 Muscle/tendon i~ Diagnost~
#>  2              402           545353          1 Muscle/tendon i~ Diagnosi~
#>  3              402           545353          1 Muscle/tendon i~ Diagnosi~
#>  4              402           545353          1 Muscle/tendon i~ National~
#>  5              758           261891          1 Cardiac conditi~ Diagnost~
#>  6              758           261891          1 Cardiac conditi~ Diagnosi~
#>  7              758           261891          1 Cardiac conditi~ National~
#>  8              894           941852          1 Respiratory con~ Diagnost~
#>  9              894           941852          1 Respiratory con~ Diagnosi~
#> 10              894           941852          1 Respiratory con~ Diagnosi~
#> # ... with 124 more rows, and 1 more variable: Values <dbl>

## do you want to pivot the column NationalDiagnosis
df %>% 
  mutate(DiagnosisSiteCode = as.integer(ifelse(DiagnosisSiteCode == "NULL", NA, DiagnosisSiteCode))) %>% 
  pivot_longer(cols = DiagnosticSchemeCode:NationalDiagnosis,
               names_to = 'variables', values_to = 'Values',
               values_drop_na = TRUE,
               values_transform = list(Values = as.character))

Created on 2020-10-21 by the reprex package (v0.3.0)

Ryan John
  • 1,410
  • 1
  • 15
  • 23