1

I have a table A like this:

cou own ind aus_d_a usa_f_b
AUS D A 3268.02 85.087
AUS D B 92.1329 10808.3

and I want to reshape it to long type data as follows, table B:

cou own ind cou2 own2 ind2 value
AUS D A aus d a 3268.02
AUS D B aus d a 92.1329
AUS D A usa f b 85.087
AUS D B usa f b 10808.3

but I don't know how to code it using R or Stata? anyone can help me, thanks a lot

PS: the data is just a sample, actually I have thousands of columns (three dimensions: country_ownership_industry, eg aus_d_c21), 60 countries, 2 ownership, 34 industries, so I have 4080 cols.

Tao Zhang
  • 45
  • 4
  • This was closed as a duplicate. If you want a Stata answer. ask again, but focus on Stata. (It is not good psychology or good politics to ask for solutions in two or more languages at the same time.) – Nick Cox May 19 '22 at 14:10

2 Answers2

3

Provided your country_ownership_industry columns are the only ones with underscores in their column names, you can do:

library(tidyr)
library(dplyr)

pivot_longer(df, contains('_'), names_sep = '_', names_to = c('cou2', 'own2', 'ind2'))
#> # A tibble: 4 x 7
#>   cou   own   ind   cou2  own2  ind2    value
#>   <chr> <chr> <chr> <chr> <chr> <chr>   <dbl>
#> 1 AUS   D     A     aus   d     a      3268. 
#> 2 AUS   D     A     usa   f     b        85.1
#> 3 AUS   D     B     aus   d     a        92.1
#> 4 AUS   D     B     usa   f     b     10808. 
Allan Cameron
  • 147,086
  • 7
  • 49
  • 87
1

Using a combination of pivot_longer and separate:

# using your own example data
dat1 <- tibble(
  cou = c('AUS', 'AUS'), 
  own = c('D', 'D'), 
  ind = c('A', 'B'), 
  aus_d_a = c(3268.02, 92.1329), 
  usa_f_b = c(85.087, 10808.3)
)

library(tidyverse)

dat1 %>%
  pivot_longer(cols = aus_d_a:usa_f_b, names_to = 'cou2', values_to = 'value') %>%
  separate(cou2, c('cou2', 'own2', 'ind2'), sep = '_')
Harrison Jones
  • 2,256
  • 5
  • 27
  • 34