I have 50 years of data in one CSV file, and I want to split it by month into 12 CSV files. For example, one file would contain only the January data for all 50 years.

How do I read/prepare the data? I have to do it in R.

YEAR    Month   level 
1900    1   1.11
1900    2   1.64
1900    3   1.35
1900    4   4.26
1900    5   4.91
1900    6   0.62
1900    7   5.6
1900    8   2.12
1900    9   5.99
1900    10  4.74
1900    11  1.69
1900    12  0.39

Thanks!

Geekuna Matata
  • Do you need to do it in R? If using unix, just use the csplit command. – James King Apr 05 '14 at 00:19
  • I don't know R, but I would suggest reading the lines modulo 12. Take the header line and store it in a variable. Then, for `i = line number % 12`, write the line to the i'th file. – alvonellos Apr 05 '14 at 00:20
  • Sorry, I have to do it in R. – Geekuna Matata Apr 05 '14 at 00:20
  • After reading your CSV into R, use the `split` function to split it into a list of data frames, which you can then write out to CSV files. See for example http://stackoverflow.com/questions/9713294/split-data-frame-based-on-levels-of-a-factor-into-new-data-frames – James King Apr 05 '14 at 00:22
  • I have to extract it monthly before splitting. Please see the question. I have edited with more details. – Geekuna Matata Apr 05 '14 at 00:25
  • the `split` function will split it into 12 monthly pieces. – James King Apr 05 '14 at 00:29

2 Answers

As suggested, you can use split and then save each piece in turn with any looping construct or function (e.g. a write.csv call inside a simple for loop, as follows).

# After having imported your csv file you'll have 
# a data.frame similar to this one

my.df <- data.frame(year = rep(1900:2000, each = 12),
                    month = rep(1:12, 101),
                    level = rnorm(101*12))

# then

df.spl <- split(my.df, my.df$month)

for (i in names(df.spl)) {
    write.csv(df.spl[[i]],
              sprintf("month_%s.csv", i),
              row.names = FALSE)
}
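
For completeness, here is a minimal end-to-end sketch that starts from a file on disk instead of a simulated data frame. It assumes the column names from the question (YEAR, Month, level); the file name levels.csv, the temporary output directory, and the zero-padded month_01.csv naming are illustrative choices, not part of the original answer.

```r
# Hypothetical input file, built from made-up values so the sketch runs on its own.
out_dir <- tempdir()
in_file <- file.path(out_dir, "levels.csv")
sample_df <- data.frame(YEAR  = rep(1900:1949, each = 12),
                        Month = rep(1:12, times = 50),
                        level = round(runif(600, 0, 6), 2))
write.csv(sample_df, in_file, row.names = FALSE)

# Read the combined file, then split it into a list of 12 data frames by Month.
all_df   <- read.csv(in_file)
by_month <- split(all_df, all_df$Month)

# Build one output path per month (month_01.csv ... month_12.csv) and write each piece.
out_files <- file.path(out_dir,
                       sprintf("month_%02d.csv", as.integer(names(by_month))))
for (k in seq_along(by_month)) {
  write.csv(by_month[[k]], out_files[k], row.names = FALSE)
}
```

Zero-padding the month number keeps the files in calendar order when they are sorted by name, which plain month_1.csv ... month_12.csv would not.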
Luca Braglia

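A simple solution: keep only the rows whose second column (Month) equals 1 to get the January data, and repeat with 2 through 12 for the other months. Here new is the data frame defined under Data below.

jan <- new[new[, 2] == 1, ]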

Data

new <- read.table(header = TRUE, text = "YEAR    Month   level
1900    1   1.11
1900    2   1.64
1900    3   1.35
1900    4   4.26
1900    5   4.91
1900    6   0.62
1900    7   5.6
1900    8   2.12
1900    9   5.99
1900    10  4.74
1900    11  1.69
1900    12  0.39")
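
The subsetting line above produces January only. A rough sketch of how the same idea might be extended to all 12 months and written to disk follows; the sample values, the temporary output directory, and the Jan.csv style file names (taken from R's built-in month.abb) are assumptions for illustration.

```r
# Made-up sample data with the same column names as the question.
new <- data.frame(YEAR  = rep(1900:1901, each = 12),
                  Month = rep(1:12, times = 2),
                  level = seq(0.5, 12, by = 0.5))

out_dir <- tempdir()  # illustrative output location

for (m in 1:12) {
  # Same logical subsetting as the January example, with the month as a variable.
  month_df <- new[new$Month == m, ]
  write.csv(month_df,
            file.path(out_dir, paste0(month.abb[m], ".csv")),
            row.names = FALSE)
}
```

Referring to the column by name (new$Month) rather than by position (new[, 2]) keeps the code working if the column order in the file changes.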
CCurtis