1

I have 2 data frames with 85 different IDs, and 40 different groups, so I'd like to use a loop as it is too much data to go through by hand.

Here's what I need to do:

I want to merge the two data frames by ID for each group, make a new column that multiplies Data X Area, then extract separate csvs by each group. Each ID in each group has a different area, but number of IDs is the same per group.

Here's the dummy data:

Group <- c(100,100,100,103,103,103,110,110,110)
ID <- c(1,2,3,1,2,3,1,2,3)
Area <- c(23,4,3,23,0,.5,7,2,33)

x <- data.frame(Group, ID, Area)

ID <- c(1,2,3)
Data <- c(.002,.4,1)

y <- data.frame(ID, Data)

Desired Output (different csv for each group):

Group    ID    Area    Data    A.D
 100      1     23      .002   .046
 100      2     4       .4     1.6
 100      3     3        1      3
kslayerr
  • 819
  • 1
  • 11
  • 21
  • have a look at `?merge` and `?split` – HubertL Nov 17 '16 at 20:44
  • 2
    is there are group id for `y`, how do you plan to determine which id in `y` matches to which id in `x`? – Nate Nov 17 '16 at 20:45
  • @NathanDay - no, there isn't... I was hoping to extract by `group` first, then merge the `ID`. As the `group` is tied to the unique `area` – kslayerr Nov 17 '16 at 20:46
  • @HubertL - I looked at split... But can I do something in a loop instead of splitting them manually? – kslayerr Nov 17 '16 at 20:47
  • What is your desired output? – Jan Nov 17 '16 at 20:49
  • 1
    so lets say we are talking about group100, how do we know which `y$Data` value to use when there are 3 that match to `ID = 1` – Nate Nov 17 '16 at 20:52
  • A nice resource: [How to join (merge) data frames (inner, outer, left, right)?](http://stackoverflow.com/questions/1299871/how-to-join-merge-data-frames-inner-outer-left-right) – Jaap Nov 17 '16 at 20:53
  • @NathanDay - you are right, I typed the data wrong. Each `ID` in `y` is unique. Thank you for catching that! – kslayerr Nov 17 '16 at 20:57
  • for your last step - you can just split the joined data frame by group into a list of dataframes using `split` and then use `sapply` and `write.csv` to output all of them at once. – vagabond Nov 18 '16 at 00:53

2 Answers2

1

If I got your question right, you might be able to do sth. like:

library(dplyr)
merge(x, y, by = c("ID")) %>% 
  mutate(A.D = Area * Data) %>%
  arrange(Group)

Which yields:

  ID Group Area  Data    A.D
1  1   100 23.0 0.002  0.046
2  2   100  4.0 0.400  1.600
3  3   100  3.0 1.000  3.000
4  1   103 23.0 0.002  0.046
5  2   103  0.0 0.400  0.000
6  3   103  0.5 1.000  0.500
7  1   110  7.0 0.002  0.014
8  2   110  2.0 0.400  0.800
9  3   110 33.0 1.000 33.000
Jan
  • 42,290
  • 8
  • 54
  • 79
  • That does work! I'll try it with the large dataset, thanks Jan. Do you know if there is a quick way to extract different CSVs by `Group`? – kslayerr Nov 17 '16 at 21:20
  • Have a look at `group_by()` from the `dplyr` package. – Jan Nov 17 '16 at 21:28
1

This is how you could do it with base:

1) Get all of your data together in z and perform your new column calculation:

z <- merge(x, y)
z$data_x_aread <- z$Data * z$Area

2) Now split z by group so it is a list of 3 data.frames:

df_list <-  split(z, Group)

3) Grab the element names from df_list (these are really just the group names). And make valid path strings with paste0. We will store these in a character vector:

vector_of_output_paths <- paste0("~/Desktop/", names(df_list), ".csv")

4) Take advantage of mapply and use it to write the csv's sourcing the data.frames to write from df_list and the respective file path's for each from vector_of_output_paths:

mapply(write.csv, df_list, vector_of_output_paths)
Nate
  • 10,361
  • 3
  • 33
  • 40
  • 1
    It works! Though I made a small edit here: `vector_of_output_paths <- paste0("", names(df_list), ".csv")` For some reason, the files were being names Desktop100, Desktop102, etc. because of that argument, and they only need to be named by group. Thanks for your help. – kslayerr Nov 18 '16 at 14:16
  • 1
    you could just remove that first term "" in `paste0`, perhaps the original was missing the "/" after Desktop causing ti behave like that. Either way as long as they are outputting to a location you can find, happy coding – Nate Nov 18 '16 at 16:36