I have the following toy dataset:

dat = data.frame(
        country = c("USA", "USA", "USA", "UK", "UK", "UK"),
        year = c(2000, 2001, 2002, 2000, 2001, 2002),
        apples.k = c(100, 60, 123, 340, 200, 235),
        pears.k = c(99, 88, 77, 22, 33, 44)
        )

The data look like this:

dat

  country year apples.k pears.k
1     USA 2000      100      99
2     USA 2001       60      88
3     USA 2002      123      77
4      UK 2000      340      22
5      UK 2001      200      33
6      UK 2002      235      44

However, I need to be able to index the dataset with dat[1] and obtain the following:

$USA

year   apples.k   pears.k
2000   100        99
2001   60         88
2002   123        77

... and the same with the UK (dat[2]):

$UK

year   apples.k   pears.k
2000   340        22
2001   200        33
2002   235        44

So, as I understand it, each entry in the new object should be a matrix holding a subset of the variables ("year", "apples.k", "pears.k"), with one such matrix for every country (USA and UK).

In reality, I have almost 300 years of data for every country in the world, and around 6 variables.

Thanks.

Karolis Koncevičius

3 Answers


There is a function for this, conveniently named split():

dat <- split(dat, dat$country)

> dat
$UK
  country year apples.k pears.k
4      UK 2000      340      22
5      UK 2001      200      33
6      UK 2002      235      44

$USA
  country year apples.k pears.k
1     USA 2000      100      99
2     USA 2001       60      88
3     USA 2002      123      77
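
Note that the output above lists UK first (split() orders groups by sorted factor levels) and that each piece still carries the country column. The following is a minimal sketch of one way to get closer to the layout asked for in the question; the names value_cols, groups, by_country and mats are illustrative, not part of the original answer, and the sketch assumes dat is still the original data frame rather than the already-split list.

```r
dat <- data.frame(
  country  = c("USA", "USA", "USA", "UK", "UK", "UK"),
  year     = c(2000, 2001, 2002, 2000, 2001, 2002),
  apples.k = c(100, 60, 123, 340, 200, 235),
  pears.k  = c(99, 88, 77, 22, 33, 44)
)

# Every column except the grouping column (hypothetical helper name)
value_cols <- setdiff(names(dat), "country")

# Fix the level order to order of first appearance, so USA comes before UK
groups <- factor(dat$country, levels = unique(dat$country))

# Split only the value columns; the list is named after the countries
by_country <- split(dat[value_cols], groups)

# If matrices are really needed rather than data frames, convert each piece
mats <- lapply(by_country, as.matrix)

by_country[1]   # one-element list named "USA"
mats[["UK"]]    # numeric matrix for the UK
```

Single brackets, as in by_country[1], return a list of length one that still prints with the $USA header, while by_country[[1]] or by_country$USA returns the data frame itself.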
Karolis Koncevičius

If you write a function, you may be able to achieve what you want without modifying dat:

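foo = function(n, x = dat, f = "country"){
    # name of the n-th distinct group, in order of first appearance
    nm = unique(x[[f]])[n]
    # one-element list holding that group's rows, named after the group
    setNames(list(subset(x, x[[f]] == nm)), as.character(nm))
}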

foo(1)
#$USA
#  country year apples.k pears.k
#1     USA 2000      100      99
#2     USA 2001       60      88
#3     USA 2002      123      77
d.b

We can use group_split from dplyr:

library(dplyr)
dat %>%
    group_split(country)
#[[1]]
# A tibble: 3 x 4
#  country  year apples.k pears.k
#  <fct>   <dbl>    <dbl>   <dbl>
#1 UK       2000      340      22
#2 UK       2001      200      33
#3 UK       2002      235      44

#[[2]]
# A tibble: 3 x 4
#  country  year apples.k pears.k
#  <fct>   <dbl>    <dbl>   <dbl>
#1 USA      2000      100      99
#2 USA      2001       60      88
#3 USA      2002      123      77
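
The list returned by group_split is unnamed, so its elements can only be reached by position. Below is a rough sketch of one way to attach the country names and drop the grouping column, assuming a dplyr version that provides the .keep argument of group_split and the group_keys function; parts and keys are illustrative names.

```r
library(dplyr)

dat <- data.frame(
  country  = c("USA", "USA", "USA", "UK", "UK", "UK"),
  year     = c(2000, 2001, 2002, 2000, 2001, 2002),
  apples.k = c(100, 60, 123, 340, 200, 235),
  pears.k  = c(99, 88, 77, 22, 33, 44)
)

# .keep = FALSE leaves the grouping column out of each piece
parts <- dat %>%
    group_split(country, .keep = FALSE)

# group_keys() returns one row per group, in the same order as the pieces
keys <- dat %>%
    group_keys(country)

# Name the list elements after their country
names(parts) <- as.character(keys$country)

parts[["USA"]]
```

As with the output shown above, the groups come out sorted by country, so UK is still the first element; the names make it possible to look countries up without relying on that order.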
akrun