0

I am trying to convert my data from wide to long using tidyverse's gather command (I would prefer to avoid melting/casting.) However, my data has two matching rows that I want to use in the conversion.

 `2018 Data`  Codes   `Code1`  `Code2`   `Code3`           
1 ID          Names Code1Name Code2Name Code3Name 
2 1110        Crop…      0         0         0                       
3 1120        Anim…      0         0         0                       
4 1131        Timb…      0         0         0                                    

You'll notice that the data is strange in that there are two rows with variable names, but the second row is being treated as the first row of observations. The codes and codenames match, but I haven't found a way to do a proper gather command that matches them. Currently, I have something like df_tall<-gather(df_wide, key="Codes", value="CodeValue") However, this does not match the names properly.

The ideal data I want to get is below:

ID                    Names             Codes    CodeValue
1 1110        Crop…   Code1Name         Code1     0                          
2 1110        Crop…   Code2Name         Code2     0                         
3 1110        Crop…   Code3Name         Code3     0                        

Any ideas of how I can achieve this?

There is a dearth of questions on StackOverflow for converting from wide to long, but I didn't have luck finding any with a similar question (please correct me if I've overlooked something).

Yu Na
  • 112
  • 1
  • 18
  • Please make this question *reproducible*. This includes sample code you've attempted (including listing non-base R packages, and any errors/warnings received), sample *unambiguous* data (e.g., `dput(head(x))` or `data.frame(x=...,y=...)`), and intended output. Refs: https://stackoverflow.com/questions/5963269, [mcve], and https://stackoverflow.com/tags/r/info. – r2evans Mar 26 '20 at 16:36
  • 1
    Your first row does not seem to have any way to possibly add new values other than `names(df_wide)` here. Is there a reason why it is needed? ... I cannot see, for example, where any of the entries in your "long" example would have the values for "Names" and "Codes" not aligned with `Code1Name:Code1`, `Code2Name:Code2`, ... and so on – chemdork123 Mar 26 '20 at 16:42
  • 1
    I think "dearth" is not the right word ... searching SO for [`[r] wide to long`](https://stackoverflow.com/search?q=%5Br%5D+wide+to+long) currently lists over 4000 returns, with the most "relevant" (sorted) entries are quite relevant. – r2evans Mar 26 '20 at 16:43
  • 1
    I agree with @chemdork123. Perhaps when you read it in you should use `read.table(..., skip=1)` since it appears your data has a two-row header. – r2evans Mar 26 '20 at 16:45

1 Answers1

0

Try this:

DF <- data.frame(`2018 Data` = c("ID","1110","1120","1131"),
           Codes = c("Names","Crop..","Anim..","Timb.."),
           `Code1` = c("Code1Name","0","0","0"),
           `Code2` = c("Code2Name","0","0","0"),
           `Code3` = c("Code3Name","0","0","0"))

names(DF) <- unname(unlist(DF[1,]))

DF %>%
  filter(ID != "ID") %>%
  pivot_longer(starts_with("Code"), names_to = "Codes", values_to = "Value") %>%
  mutate(Code = sub("Name","",Codes)) %>%
  select(ID, Names, Codes, Code, Value) %>%
  as.data.frame()

Result

    ID  Names     Codes  Code Value
1 1110 Crop.. Code1Name Code1     0
2 1110 Crop.. Code2Name Code2     0
3 1110 Crop.. Code3Name Code3     0
4 1120 Anim.. Code1Name Code1     0
5 1120 Anim.. Code2Name Code2     0
6 1120 Anim.. Code3Name Code3     0
7 1131 Timb.. Code1Name Code1     0
8 1131 Timb.. Code2Name Code2     0
9 1131 Timb.. Code3Name Code3     0

PS: gather() function is retired, use pivot_longer() instead

nurandi
  • 1,588
  • 1
  • 11
  • 20