How could I split a data.frame?

Question

I have precipitation data from 50 synoptic stations for the years 1986 to 2015.

I need to pull out the records for the years 2007 to 2015, for each station separately. There are three variables:

the station's name
the specific year
the amount of precipitation

I need the result for each station separately. Does anyone know how to use "split" for this purpose? Could you please write the code from the beginning, starting with "read.table"?

Maybe this answer can help you: [How to split a data frame](https://stackoverflow.com/a/3302671/4969485) — J.Hpour, Apr 30 '20 at 05:31
Can you provide some illustrative and reproducible data? And what do you mean by "result"--the summed precipitation, the averaged precipitation? — Chris Ruehlemann, Apr 30 '20 at 09:13

score 0 · Accepted Answer · answered Apr 30 '20 at 09:26

If your task is simply to split the dataframe by year you can use split:

split(df, f = df$year)

Illustrative data:

set.seed(123)  # make the random sample reproducible
df <- data.frame(
  station = sample(LETTERS[1:3], 10, replace = TRUE),
  year = paste0("201", sample(1:9, 10, replace = TRUE)),
  precipitation = sample(333:444, 10, replace = TRUE)
)

Result:

$`2011`
  station year precipitation
5       C 2011           406
8       C 2011           399

$`2013`
  station year precipitation
7       B 2013           393
9       B 2013           365

$`2015`
  station year precipitation
2       C 2015           410

$`2016`
  station year precipitation
4       C 2016           444

$`2017`
  station year precipitation
3       B 2017           404

$`2019`
   station year precipitation
1        A 2019           432
6        A 2019           412
10       B 2019           349
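
The question asks for one table per station, restricted to 2007 to 2015 and starting from `read.table`. The following is a minimal sketch of that variant, not a definitive solution. It assumes the columns are named `station`, `year`, and `precipitation` as in the illustrative data above. The inline table and the file name `precipitation.txt` in the comment are made up for illustration.

```r
# Hypothetical input: with a real file you would write something like
#   df <- read.table("precipitation.txt", header = TRUE)  # file name is an assumption
# Here a small inline table stands in for that file.
df <- read.table(header = TRUE, text = "
station year precipitation
A 2005 410
A 2008 395
B 2007 350
B 2016 402
C 2010 388
C 2015 420
")

# Keep only the years 2007 to 2015 (inclusive)
df_sub <- df[df$year >= 2007 & df$year <= 2015, ]

# One data frame per station, stored in a named list
by_station <- split(df_sub, f = df_sub$station)

names(by_station)   # the station names
by_station$C        # the rows for station C only
```

Each element of `by_station` is an ordinary data frame, so a single station can be accessed with `by_station$A` or `by_station[["A"]]`.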

Chris Ruehlemann


Dear friend, the result is exactly what I want, but unfortunately I can't understand the numbers you wrote: `set.seed(123); df <- data.frame(station = sample(LETTERS[1:3], 10, replace = T), year = paste0("201", sample(1:9, 10, replace = T)), precipitation = sample(333:444, 10, replace = T))` – Maryam Apr 30 '20 at 15:52
The numbers are just **any** numbers, intended merely as illustrative material to show how the code works. To make the code work for your data, please adapt it. The relevant line of code for you is this small bit: `split(df, f = df$year)`. By the way, if the code helps you with your problem, please consider accepting the answer by clicking the tick in the upper left corner. Thanks in advance! – Chris Ruehlemann Apr 30 '20 at 17:07
Yes, I know I need to change the numbers to fit my own data. Could you please clarify what "201", 1:9, and 10 refer to in `year = paste0("201", sample(1:9, 10, replace = T))`? – Maryam Apr 30 '20 at 17:55
That's just a quick way to generate some numbers that look like years: `paste0` is a function to paste strings together; here we string together "201" plus a randomly selected number between 1 and 9, thus yielding years like 2011, 2017, 2013, etc. – Chris Ruehlemann Apr 30 '20 at 21:05
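
To make the pieces of that expression concrete, here is a small sketch that builds the fake years step by step. The intermediate names `digits` and `years` are introduced only for illustration and do not appear in the answer.

```r
set.seed(123)  # fixes the random draws so the example is repeatable

# sample(1:9, 10, replace = TRUE): draw 10 values from the integers 1..9,
# allowing the same value to be drawn more than once
digits <- sample(1:9, 10, replace = TRUE)

# paste0("201", digits): glue the string "201" in front of each drawn digit,
# giving character strings of the form "201x"
years <- paste0("201", digits)

years              # a character vector of length 10
as.integer(years)  # convert to numbers if you need to compare or filter by year
```

Because `paste0` returns text, the `year` column in the illustrative data is a character column. With real data read from a file, the year is usually numeric already.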
