0

I have the following table:

  country   | country code |    variable     | 1990 | 1991 | 1992 | 1993 | ... 
 -----------|--------------|-----------------|------|------|------|------|----- 
  Argentina | ARG          | employment in a |  .40 |  .35 |  .37 |  .43 | ... 
  Argentina | ARG          | employment in b |  .35 |  .30 |  .32 |  .29 | ... 
  Argentina | ARG          | employment in c |  .25 |  .35 |  .31 |  .28 | ... 
-
  Brazil    | BRA          | employment in a |  .45 |  .35 |  .33 |  .44 | ... 
  Brazil    | BRA          | employment in b |  .30 |  .33 |  .34 |  .27 | ... 
  Brazil    | BRA          | employment in c |  .25 |  .32 |  .33 |  .29 | ... 
  ...       | ...          | ...             |  ... |  ... |  ... |  ... | ... 

but I need to change it to this format:

  country   | country code | year | employment in a | employment in b | employment in c 
 -----------|--------------|------|-----------------|-----------------|-----------------
  Argentina | ARG          | 1990 |             .40 |             .35 |             .25 
  Argentina | ARG          | 1991 |             .35 |             .30 |             .35 
  Argentina | ARG          | 1992 |             .37 |             .32 |             .31 
  Argentina | ARG          | 1993 |             .43 |             .29 |             .28 
  Argentina | ARG          |  ... |             ... |             ... |             ... 
-
  Brazil    | BRA          | 1990 |             .45 |             .30 |             .25 
  Brazil    | BRA          | 1991 |             .35 |             .33 |             .32 
  Brazil    | BRA          | 1992 |             .33 |             .34 |             .33 
  Brazil    | BRA          | 1993 |             .44 |             .27 |             .29 
  Brazil    | BRA          |  ... |             ... |             ... |             ...  
  ...       | ...          |  ... |             ... |             ... |             ...  

I have more variables, countries and years but the tables should illustrate the problem sufficiently.

I tried gather() and reshape() but was not able to solve my problem. Could someone help me with this problem?

Lila
  • 59
  • 6

1 Answers1

2

This has to be a duplicate, but while I find a suitable dupe target this can be solved using the tidyverse with a gather+spread call

library(tidyverse)
df %>% gather(year, val, -(1:3)) %>% spread(variable, val)
## A tibble: 8 x 6
#  country   country_code year  employment_in_a employment_in_b employment_in_c
#  <chr>     <chr>        <chr>           <dbl>           <dbl>           <dbl>
#1 Argentina ARG          1990             0.4            0.35            0.25
#2 Argentina ARG          1991             0.35           0.3             0.35
#3 Argentina ARG          1992             0.37           0.32            0.31
#4 Argentina ARG          1993             0.43           0.290           0.28
#5 Brazil    BRA          1990             0.45           0.3             0.25
#6 Brazil    BRA          1991             0.35           0.33            0.32
#7 Brazil    BRA          1992             0.33           0.34            0.33
#8 Brazil    BRA          1993             0.44           0.27            0.290

Sample data

df <- read_table2("country    country_code     variable      1990  1991  1992  1993
  Argentina  ARG           employment_in_a   .40   .35   .37   .43
  Argentina  ARG           employment_in_b   .35   .30   .32   .29
  Argentina  ARG           employment_in_c   .25   .35   .31   .28
  Brazil     BRA           employment_in_a   .45   .35   .33   .44
  Brazil     BRA           employment_in_b   .30   .33   .34   .27
  Brazil     BRA           employment_in_c   .25   .32   .33   .29 ")
Maurits Evers
  • 49,617
  • 4
  • 47
  • 68
  • The ``gather``+``spread``functions worked perfectly for this table but when I tried using it for another (very similar) table, the following error message pops up: "Each row of output must be identified by a unique combination of keys." What can I do to not get the error message? – Lila May 05 '19 at 22:33