I have a large data set covering many years. I would like to split it into a separate data set for each year using R.

REC_NUM  YEAR  LOC2  REP  TRT  PLOT#  HYBRID  FEMALE  MALE    Combine        GWAS  LO_CODE  KC  %M    MwVOL  MwFSH
95384    1996  B02   1    167  1026   HW109R  75-514  71-760  75-514-71-760  X     8        81  16.5  3275   1
95414    1996  B02   2    167  2167   HW109R  75-514  71-760  75-514-71-760  X     8        83  15.2  3300   1
95387    1996  B05   1    212  1052   HW109R  75-514  71-760  75-514-71-760  X     8        82  15.4  3175   1
95415    1996  B05   2    212  2011   HW109R  75-514  71-760  75-514-71-760  X     8        88  15.8  3075   1
95361    1996  B06   1    37   1005   HW109R  75-514  71-760  75-514-71-760  X     2        92  15.2  3275   1
95391    1996  B06   2    37   2024   HW109R  75-514  71-760  75-514-71-760  X     2        76  15.3  3300   1
95389    1996  B07   1    236  1150   HW109R  75-514  71-760  75-514-71-760  X     9        98  16    3350   1
95417    1996  B07   2    236  2082   HW109R  75-514  71-760  75-514-71-760  X     9        74  14.5  3450   1
95373    1996  B08   1    57   1013   HW109R  75-514  71-760  75-514-71-760  X     7        78  16.3  3250   1
95402    1996  B08   2    57   2017   HW109R  75-514  71-760  75-514-71-760  X     7        89  15.8  3400   1
95364    1996  B10   1    41   1040   HW109R  75-514  71-760  75-514-71-760  X     4        85  15.5  3125   1
95371    1996  B10   1    45   1039   HW109R  75-514  71-760  75-514-71-760  X     4        79  15.1  3325   1

The data run from 1996 to 2011 and are not balanced: the REC_NUM values (first column) differ from year to year. How can I do this in R? Thanks in advance.

Gregor Thomas

1 Answer

As @Gregor mentioned:

 # Example data: two years, two records each
 df <- data.frame(YEAR = c("2001","2001","2002","2002"), REC_NUM = c(95384, 95414, 95387, 95415))
 # Returns a named list with one data frame per year
 split(df, f = df$YEAR)

I'm not sure what you mean by "not in balance".
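For what it's worth, `split()` does not need the groups to be the same size, so an unequal number of rows per year should not be a problem. A minimal sketch with made-up values (the data frame `toy` is hypothetical, not the asker's data):

```r
# Hypothetical toy data: three rows for 1996, one for 1997
toy <- data.frame(
  REC_NUM = c(95384, 95414, 95387, 96001),
  YEAR    = c(1996, 1996, 1996, 1997)
)

# One data frame per year; the list names are the years as strings
by_year <- split(toy, toy$YEAR)

# Number of rows in each piece
rows_per_year <- sapply(by_year, nrow)
print(rows_per_year)
```

Each element can then be pulled out by name, for example `by_year[["1996"]]`.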

Edgar Santos
  • Thanks for guiding me. I looked at the split command and did the following: `h <- read.table(file.choose(), header=T)` ... `s <- split(h, h$YEAR)`. So now, how can I get separate files, each containing the data from one year? – Ahmed Sallam Mar 15 '17 at 22:15
  • "Not in balance" means that I have a different number of rows in each year. – Ahmed Sallam Mar 15 '17 at 22:20
  • You can do it manually using `s1 <- s[[1]]` or use `lapply` [http://stackoverflow.com/questions/9713294/split-data-frame-based-on-levels-of-a-factor-into-new-data-frames] – Edgar Santos Mar 15 '17 at 22:29
  • Thank you this is very helpful. I appreciate your time. – Ahmed Sallam Mar 15 '17 at 22:37
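To follow up on the question in the comments about getting one file per year: one possible approach is to loop over the names of the list returned by `split()` and call `write.table()` on each piece. This is only a sketch. The toy data, the output folder (`tempdir()`), and the `data_<year>.txt` file-name pattern are assumptions for illustration; in the asker's case `s` would come from `split(h, h$YEAR)` on the real data.

```r
# Hypothetical stand-in for the asker's data (values are made up)
h <- data.frame(
  REC_NUM = c(95384, 95414, 96001),
  YEAR    = c(1996, 1996, 1997),
  KC      = c(81, 83, 90)
)
s <- split(h, h$YEAR)

out_dir <- tempdir()  # assumption: replace with your own folder

# Write one tab-separated file per year and collect the file paths
out_files <- vapply(names(s), function(yr) {
  path <- file.path(out_dir, paste0("data_", yr, ".txt"))
  write.table(s[[yr]], file = path, sep = "\t",
              row.names = FALSE, quote = FALSE)
  path
}, character(1))
```

`out_files` is a character vector named by year, so `out_files[["1996"]]` gives the path of the 1996 file.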