
I know this is rather simple for R users, but I have struggled with the following task:

I have this dataframe:

data

set.seed(123)
df <- data.frame(ID_series=c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18), 
             year=c(2010,2011,2012,2010,2011,2012,2010,2011,2012,2010,2011,2012,
                    2010,2011,2012,2010,2011,2012), 
             IDPlot=c(1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2), 
             value=runif(18, 0.0, 1.0))

I would need:

  1. to split the dataframe by IDPlot
  2. rearrange it so that year is in rows and ID_series in columns
  3. save each resulting dataframe to a separate csv file named by its IDPlot.

Thank you for suggestions! Michal

naRuser
  • Please provide your expected output. Have you tried anything? This might be simple with `data.table::dcast` or `tidyr::spread`. – r2evans Dec 08 '19 at 12:37
  • ...the output should be separate matrices [year, ID_series], one per IDPlot, stored in separate csv files named by IDPlot. Is this enough info for you? Thank you very much for your efforts... – naRuser Dec 08 '19 at 12:43
  • Sorry, that is not clear enough. Do you mean something like this for one of them? `structure(list(year = c(2010, 2011, 2012), \`1\` = c(1, 1, 1), \`2\` = c(1, 1, 1), \`3\` = c(1, 1, 1)), class = "data.frame", row.names = c(NA, -3L))` Also, *what have you tried*? – r2evans Dec 08 '19 at 12:45
  • Does this answer your question? [Split dataframe into multiple output files](https://stackoverflow.com/questions/10002021/split-dataframe-into-multiple-output-files) – MDEWITT Dec 08 '19 at 12:46
  • @MDEWITT, I think you're overlooking the reshaping portion of the question (though the split/write thing is probably correct). – r2evans Dec 08 '19 at 12:47
  • Sorry, I forgot to add one more variable in df: `df <- data.frame(ID_series=c(1,1,1,2,2,2,3,3,3,1,1,1,2,2,2,3,3,3), year=c(2010,2011,2012,2010,2011,2012,2010,2011,2012,2010,2011,2012,2010,2011,2012,2010,2011,2012), IDPlot=c(1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2), value=runif(18, 0.0, 1.0))`. Here, the value should go into the matrix cells. – naRuser Dec 08 '19 at 12:53
  • naRuser, I think RonakShah's answer is half of your problem, and MDEWITT's link answers the other aspect of your question, both are covered by the duplicates' links. If they are not enough, then update your question to include your expected answer, perhaps adding why the linked dupes are not right. Thanks! – r2evans Dec 08 '19 at 13:11

1 Answer


You could convert the data into wide format and then use group_split to split it into different dataframes.

library(tidyverse)

out <- df %>%
         pivot_wider(names_from = ID_series, values_from = value) %>%
         group_split(IDPlot) 

out
#[[1]]
# A tibble: 3 x 5
#   year IDPlot   `1`    `2`   `3`
#  <dbl>  <dbl> <dbl>  <dbl> <dbl>
#1  2010      1 0.288 0.883  0.528
#2  2011      1 0.788 0.940  0.892
#3  2012      1 0.409 0.0456 0.551

#[[2]]
# A tibble: 3 x 5
#   year IDPlot   `1`   `2`    `3`
#  <dbl>  <dbl> <dbl> <dbl>  <dbl>
#1  2010      2 0.457 0.678 0.900 
#2  2011      2 0.957 0.573 0.246 
#3  2012      2 0.453 0.103 0.0421

Or the other way round: first split, then convert every dataframe to wide format.

out <- df %>%
         group_split(IDPlot) %>%
         map(~pivot_wider(., names_from = ID_series, values_from = value))
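One reason to prefer splitting first: when the ID_series values differ between plots, pivoting the whole table first gives every piece a column for every series (filled with NA where a plot lacks that series), while splitting first gives each piece only its own series. A small sketch of the difference, using a made-up `toy` data frame that is not part of the question:

```r
library(dplyr)
library(tidyr)
library(purrr)

# Hypothetical toy data: plot 1 has series 1 and 2, plot 2 has series 3 and 4
toy <- data.frame(ID_series = c(1, 1, 2, 2, 3, 3, 4, 4),
                  year      = rep(c(2010, 2011), times = 4),
                  IDPlot    = rep(c(1, 2), each = 4),
                  value     = (1:8) / 10)

# Pivot first, then split: every piece carries columns for all four series
pivot_first <- toy %>%
  pivot_wider(names_from = ID_series, values_from = value) %>%
  group_split(IDPlot)

# Split first, then pivot: each piece only has columns for its own series
split_first <- toy %>%
  group_split(IDPlot) %>%
  map(~ pivot_wider(.x, names_from = ID_series, values_from = value))

# Compare the column names of the first piece under each approach
names(pivot_first[[1]])
names(split_first[[1]])
```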

If you want to write them in separate csvs, you can do

# row.names = FALSE belongs to write.csv(), not to paste0()
lapply(seq_along(out), function(x) 
      write.csv(out[[x]], paste0('df', x, '.csv'), row.names = FALSE))
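The files above are numbered by list position. Since the question asks for files named by IDPlot, here is a minimal sketch of one way to do that. It assumes base R's `split()`, which names the list elements by the IDPlot value, and a hypothetical file-name pattern `plot_<IDPlot>.csv` in a temporary directory. The IDPlot column is dropped before reshaping so each file holds only year plus one column per ID_series.

```r
library(tidyr)

set.seed(123)
df <- data.frame(ID_series = rep(rep(1:3, each = 3), times = 2),
                 year      = rep(2010:2012, times = 6),
                 IDPlot    = rep(1:2, each = 9),
                 value     = runif(18))

# split() keeps the IDPlot values as list names ("1", "2")
pieces <- split(df, df$IDPlot)

# Reshape each piece: one row per year, one column per ID_series
wide <- lapply(pieces, function(d)
  pivot_wider(d[, c("year", "ID_series", "value")],
              names_from = ID_series, values_from = value))

# Hypothetical naming scheme: plot_<IDPlot>.csv in a temporary directory
out_dir <- tempdir()
files <- file.path(out_dir, paste0("plot_", names(wide), ".csv"))
for (i in seq_along(wide)) {
  write.csv(wide[[i]], files[i], row.names = FALSE)
}
```

Replace `out_dir` with the target folder to keep the files; `tempdir()` is used here only so the sketch does not write into the working directory.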

data

set.seed(123)
df <- data.frame(ID_series=c(1,1,1,2,2,2,3,3,3,1,1,1,2,2,2,3,3,3), 
             year=c(2010,2011,2012,2010,2011,2012,2010,2011,2012,2010,2011,2012,
                    2010,2011,2012,2010,2011,2012), 
             IDPlot=c(1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2), 
             value=runif(18, 0.0, 1.0))
Ronak Shah
  • ...thank you for the code. It is a good option; however, in my original dataset (I did not provide it, as it has over 21 thousand rows), the names of ID_series change among the IDPlots. So I would need an option in which the split goes first, then the split dataframes are reshaped into matrices [year, ID_series], and finally these reshaped matrices are stored in new separate dataframes. Thanks again to all who provided ideas! – naRuser Dec 09 '19 at 07:25
  • @naRuser Check the updated answer. It first splits and then converts it into wide format. Is that what you want ? – Ronak Shah Dec 09 '19 at 07:28
  • ...yes, this works, perfect! many thanks! – naRuser Dec 09 '19 at 07:35