I have a data set with three columns: Part, Claimid and Cost. The dataset looks like this:

Part   Claimid  Cost
Part1  ID1      12
Part1  ID20     29
Part2  ID21     21
Part2  ID40     13
Part3  ID41     11
Part3  ID60     10

The Cost column is a random number between 1 and 10. I am trying to run a loop over every Part (3 parts here) and use the dplyr package to create three distinct data frames:

library(dplyr)
claimid <- read.csv(file.choose(),header = TRUE)
plist <- unique(claimid$Part) ##Create the number of loops (Here 3)
for (i in plist) {
    plist <- claimid %>% select(Part,Claimid) %>% filter(Part %in% i)
}

When I print plist I get only the last 20 observations, because R overwrites plist on every iteration and keeps only the result of the last one.

user438383
ARIMITRA MAITI

1 Answer

If we use a for loop, we need to create a list to store the output of each iteration. It is better to keep the data.frames in a list than as three separate data.frame objects.

plist <- unique(claimid$Part)
lst <- setNames(vector("list", length(plist)), plist)
for (i in seq_along(plist)) {
  lst[[i]] <- claimid %>%
    select(Part, Claimid) %>%
    filter(Part %in% plist[i])
}
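To make the loop above easy to try, here is a minimal sketch that rebuilds the six sample rows from the question as a data frame (an assumption standing in for the file read with `read.csv(file.choose())`) and then shows how the named list can be used afterwards.

```r
library(dplyr)

# Sample rows copied from the question; stands in for the CSV file
claimid <- data.frame(
  Part    = c("Part1", "Part1", "Part2", "Part2", "Part3", "Part3"),
  Claimid = c("ID1", "ID20", "ID21", "ID40", "ID41", "ID60"),
  Cost    = c(12, 29, 21, 13, 11, 10),
  stringsAsFactors = FALSE
)

plist <- unique(claimid$Part)

# Pre-allocate one named slot per part, then fill the slots in the loop
lst <- setNames(vector("list", length(plist)), plist)
for (i in seq_along(plist)) {
  lst[[i]] <- claimid %>%
    select(Part, Claimid) %>%
    filter(Part %in% plist[i])
}

# Each data frame can be reached by name ...
part2 <- lst[["Part2"]]

# ... and the whole collection can be processed in one call
rows_per_part <- vapply(lst, nrow, integer(1))
```

Because every subset lives in one object, nothing has to be looked up by a constructed variable name later on.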

But this can be done more directly with lapply:

lst1 <- lapply(plist, function(nm) {
  claimid %>%
    select(Part, Claimid) %>%
    filter(Part %in% nm)
})
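One detail worth knowing: `lapply` over a plain character vector returns an unnamed list, so the elements of `lst1` can only be reached by position unless names are added. The sketch below, again using the question's sample rows as assumed data, adds the names with `setNames` and also shows base R's `split()` as a different technique that produces a named list in one step.

```r
library(dplyr)

# Sample rows copied from the question; stands in for the CSV file
claimid <- data.frame(
  Part    = c("Part1", "Part1", "Part2", "Part2", "Part3", "Part3"),
  Claimid = c("ID1", "ID20", "ID21", "ID40", "ID41", "ID60"),
  Cost    = c(12, 29, 21, 13, 11, 10),
  stringsAsFactors = FALSE
)
plist <- unique(claimid$Part)

# lapply version from the answer, with names attached afterwards
lst1 <- lapply(plist, function(nm) {
  claimid %>%
    select(Part, Claimid) %>%
    filter(Part %in% nm)
})
lst1 <- setNames(lst1, plist)

# Alternative (not the answer's method): split() groups the rows by Part
# and names each element after the part it contains
by_part <- split(select(claimid, Part, Claimid), claimid$Part)
```

Note that `split()` keeps the original row names, so the two lists hold the same values but are not byte-for-byte identical.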

However, if we really need three separate data.frame objects, assign is an option (but not recommended):

for (i in plist) {
  assign(i, claimid %>% select(Part, Claimid) %>% filter(Part %in% i))
}


Part1
#   Part Claimid
#1 Part1     ID1
#2 Part1    ID20

Part2
#   Part Claimid
#1 Part2    ID21
#2 Part2    ID40

Part3
#   Part Claimid
#1 Part3    ID41
#2 Part3    ID60
akrun
  • Many Thanks "akrun". I got to learn seq_along function and assign function. Thank you for helping beginners like me. God bless. – ARIMITRA MAITI Aug 20 '16 at 08:36
  • Hi akrun, If I use this code claimid <- read.csv(file.choose(),header = TRUE) df <- data.frame(Part = character(),totcost = integer(),claim = integer(),stringsAsFactors = FALSE) plist <- unique(claimid$Part) for (i in plist) { assign(i, claimid %>% filter(Part %in% i) %>% group_by(Part) %>% summarise(totcost = sum(Cost), claim = n_distinct(Claimid))) rbind(df,i) } Would I be able to append the datasets Part1, Part2 and Part3 together in a single dataset df? – ARIMITRA MAITI Aug 20 '16 at 09:11
  • @ARIMITRAMAITI If you started with a single dataset, I am not sure why you are splitting it into different datasets. – akrun Aug 20 '16 at 11:46
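
Following the last comment, the per-part totals can be computed on the single dataset without splitting it first. This is a minimal sketch, assuming the six sample rows from the question as data; the column names `totcost` and `claim` are taken from the follow-up comment.

```r
library(dplyr)

# Sample rows copied from the question; stands in for the CSV file
claimid <- data.frame(
  Part    = c("Part1", "Part1", "Part2", "Part2", "Part3", "Part3"),
  Claimid = c("ID1", "ID20", "ID21", "ID40", "ID41", "ID60"),
  Cost    = c(12, 29, 21, 13, 11, 10),
  stringsAsFactors = FALSE
)

# One row per part: total cost and number of distinct claims
df <- claimid %>%
  group_by(Part) %>%
  summarise(totcost = sum(Cost), claim = n_distinct(Claimid))

# With the sample rows, totcost is 41, 34 and 21 and claim is 2 for each part
```

In the loop from the comment, `rbind(df, i)` binds the character string `i` rather than the data frame it names, and the result is never assigned back to `df`, so `df` would stay empty. Grouping the whole dataset avoids that step altogether.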