0

I am coding in R.

I have a table like :

region;2012;2013;2014;2015
1;2465;245;2158;645
2;44;57;687;564
3;545;784;897;512
...

And I want to transform it into :

region;value;annee
1;2465;2012
1;245;2013
1;2158;2014
1;645;2015
2;44;2012
...

Do you know how I can do it ?

Brian Tompsett - 汤莱恩
  • 5,753
  • 72
  • 57
  • 129
celianou
  • 207
  • 4
  • 25
  • Welcome to StackOverflow! Please read the info about [how to ask a good question](http://stackoverflow.com/help/how-to-ask) and how to give a [reproducible example](http://stackoverflow.com/questions/5963269). This will make it much easier for others to help you. – Sotos Oct 12 '17 at 14:09
  • The first table has what is called "wide format", the later has "long format". Searching for these terms you will find a large number of different approaches within R, as this questions comes up often times and different package authors took different approaches. Try this link: https://stackoverflow.com/questions/2185252/reshaping-data-frame-from-wide-to-long-format – Bernhard Oct 12 '17 at 14:12
  • Thanks a lot, with your explanation wide and long I found an answer on internet :) – celianou Oct 12 '17 at 14:17

2 Answers2

2

First, read the data:

dat <- read.csv2(text = "region;2012;2013;2014;2015
1;2465;245;2158;645
2;44;57;687;564
3;545;784;897;512",
                 check.names = FALSE)

The data frame con be converted into the long format with gather from package tidyr.

library(tidyr)
dat_long <- gather(dat, key = "annee", , -region)

The result:

   region annee value
1       1  2012  2465
2       2  2012    44
3       3  2012   545
4       1  2013   245
5       2  2013    57
6       3  2013   784
7       1  2014  2158
8       2  2014   687
9       3  2014   897
10      1  2015   645
11      2  2015   564
12      3  2015   512

You can also produce the ;-separated result of your question:

write.csv2(dat_long, "", row.names = FALSE, quote = FALSE)

This results in:

region;annee;value
1;2012;2465
2;2012;44
3;2012;545
1;2013;245
2;2013;57
3;2013;784
1;2014;2158
2;2014;687
3;2014;897
1;2015;645
2;2015;564
3;2015;512
Sven Hohenstein
  • 80,497
  • 17
  • 145
  • 168
1

An example to answer the question :

olddata_wide
#>   subject sex control cond1 cond2
#> 1       1   M     7.9  12.3  10.7
#> 2       2   F     6.3  10.6  11.1
#> 3       3   F     9.5  13.1  13.8
#> 4       4   M    11.5  13.4  12.9

library(tidyr)

# The arguments to gather():
# - data: Data object
# - key: Name of new key column (made from names of data columns)
# - value: Name of new value column
# - ...: Names of source columns that contain values
# - factor_key: Treat the new key column as a factor (instead of character vector)
data_long <- gather(olddata_wide, condition, measurement, control:cond2, factor_key=TRUE)
data_long
#>    subject sex condition measurement
#> 1        1   M   control         7.9
#> 2        2   F   control         6.3
#> 3        3   F   control         9.5
#> 4        4   M   control        11.5
#> 5        1   M     cond1        12.3
#> 6        2   F     cond1        10.6
#> 7        3   F     cond1        13.1
#> 8        4   M     cond1        13.4
#> 9        1   M     cond2        10.7
#> 10       2   F     cond2        11.1
#> 11       3   F     cond2        13.8
#> 12       4   M     cond2        12.9
celianou
  • 207
  • 4
  • 25