R reshape wide to long using strings

Question

I have a table A like this:

cou	own	ind	aus_d_a	usa_f_b
AUS	D	A	3268.02	85.087
AUS	D	B	92.1329	10808.3

and I want to reshape it to long type data as follows, table B:

cou	own	ind	cou2	own2	ind2	value
AUS	D	A	aus	d	a	3268.02
AUS	D	B	aus	d	a	92.1329
AUS	D	A	usa	f	b	85.087
AUS	D	B	usa	f	b	10808.3

but I don't know how to code it using R or Stata? anyone can help me, thanks a lot

PS: the data is just a sample, actually I have thousands of columns (three dimensions: country_ownership_industry, eg aus_d_c21), 60 countries, 2 ownership, 34 industries, so I have 4080 cols.

This was closed as a duplicate. If you want a Stata answer. ask again, but focus on Stata. (It is not good psychology or good politics to ask for solutions in two or more languages at the same time.) — Nick Cox, May 19 '22 at 14:10

Allan Cameron · Answer 1 · 2022-05-19T12:53:10.973

3

Provided your country_ownership_industry columns are the only ones with underscores in their column names, you can do:

library(tidyr)
library(dplyr)

pivot_longer(df, contains('_'), names_sep = '_', names_to = c('cou2', 'own2', 'ind2'))
#> # A tibble: 4 x 7
#>   cou   own   ind   cou2  own2  ind2    value
#>   <chr> <chr> <chr> <chr> <chr> <chr>   <dbl>
#> 1 AUS   D     A     aus   d     a      3268. 
#> 2 AUS   D     A     usa   f     b        85.1
#> 3 AUS   D     B     aus   d     a        92.1
#> 4 AUS   D     B     usa   f     b     10808.

edited May 19 '22 at 12:53

answered May 19 '22 at 12:47

Allan Cameron

147,086
7
49
87

Thank you so much for the answer, Allan, it works so well. – Tao Zhang May 19 '22 at 12:57

score 1 · Answer 2 · answered May 19 '22 at 12:48

Using a combination of pivot_longer and separate:

# using your own example data
dat1 <- tibble(
  cou = c('AUS', 'AUS'), 
  own = c('D', 'D'), 
  ind = c('A', 'B'), 
  aus_d_a = c(3268.02, 92.1329), 
  usa_f_b = c(85.087, 10808.3)
)

library(tidyverse)

dat1 %>%
  pivot_longer(cols = aus_d_a:usa_f_b, names_to = 'cou2', values_to = 'value') %>%
  separate(cou2, c('cou2', 'own2', 'ind2'), sep = '_')

R reshape wide to long using strings

2 Answers2