I am using R and I have the following data frame:

data <- data.frame(ID_CODE  = c('001', '001', '001', '002', '002', '003'),
  Metric1 = c('0.94', '0.68', '0.8', '0.12', '0.56', '0.87'))

I would like to create a loop that: (1) applies a subset to the data frame and (2) creates for each unique identifier in ID_CODE a separate data frame that includes all the rows pertaining to that identifier.

As a first step, I used pull to get all identifiers into a vector:

# Get the identifiers
Identifiers <- dplyr::pull(data, ID_CODE)

Then I tried to write the loop. The subsetting works, but storing each subset in a separate data frame named "temp_[Identifier]" fails with the error: object 'temp_' not found.

for (i in unique(data$ID_CODE)) {
  temp_[[i]] <- subset(data, ID_CODE == i)
}

The resulting data frames should be:

temp_001 <- data.frame(ID_CODE  = c('001', '001', '001'),
  Metric1 = c('0.94', '0.68', '0.8'))

temp_002 <- data.frame(ID_CODE  = c('002', '002'),
  Metric1 = c('0.12', '0.56'))

temp_003 <- data.frame(ID_CODE  = c('003'),
  Metric1 = c('0.87'))

Can anybody help?

Rbeginner
  • I suspect you will find it far easier to store your derived data frames in a list rather than in standalone objects. Take a look at the online doc for `dplyr::group_split`. `dfList <- data %>% group_by(ID_CODE) %>% group_split()`. – Limey Mar 27 '23 at 09:43
  • See if it helps: https://stackoverflow.com/a/75854433/13323413 – Md Ahsanul Himel Mar 27 '23 at 10:02

1 Answer

You can easily split your data frame into a list. To get the names as desired, you could use sprintf to build a prefix plus a three-digit counter. Note that the counter matches the identifiers here only because they run consecutively from 001.

lst <- split(data, data$ID_CODE) |> setNames(sprintf('temp_%03d', seq_along(unique(data$ID_CODE))))
lst
# $temp_001
#   ID_CODE Metric1
# 1     001    0.94
# 2     001    0.68
# 3     001     0.8
#
# $temp_002
#   ID_CODE Metric1
# 4     002    0.12
# 5     002    0.56
#
# $temp_003
#   ID_CODE Metric1
# 6     003    0.87
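
If the identifiers are not consecutive (say '001', '005', '012'), a positional counter would no longer line up with them. A minimal sketch that takes the names from the identifiers themselves, assuming the same example data; lst2 is just a name chosen here for illustration:

```r
# Example data, repeated so the snippet runs on its own
data <- data.frame(ID_CODE = c('001', '001', '001', '002', '002', '003'),
                   Metric1 = c('0.94', '0.68', '0.8', '0.12', '0.56', '0.87'))

# split() names each list element after its ID_CODE value
lst2 <- split(data, data$ID_CODE)

# Prepend the prefix to those names instead of using a counter
names(lst2) <- paste0('temp_', names(lst2))

names(lst2)
# [1] "temp_001" "temp_002" "temp_003"
```

This way each name is always derived from the ID it holds, whatever the IDs look like.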

If you really want to pull the data frames from the list into the global environment (not recommended), you can do:

list2env(lst, .GlobalEnv)

ls()
# [1] "data"     "lst"      "temp_001" "temp_002" "temp_003"
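
As for the loop in the question: temp_[[i]] <- ... fails because R has to read an existing object temp_ before it can assign into it, and none was created. A rough sketch of a loop that works, assuming you keep everything in one list; the name temp is just a placeholder chosen here:

```r
# Example data, repeated so the snippet runs on its own
data <- data.frame(ID_CODE = c('001', '001', '001', '002', '002', '003'),
                   Metric1 = c('0.94', '0.68', '0.8', '0.12', '0.56', '0.87'))

# The container must exist before temp[[...]] <- ... can assign into it
temp <- list()

for (i in unique(data$ID_CODE)) {
  # Store each subset under a name built from its identifier
  temp[[paste0('temp_', i)]] <- subset(data, ID_CODE == i)
}

names(temp)
# [1] "temp_001" "temp_002" "temp_003"

# Access a single data frame by name
temp[['temp_002']]
```

Keeping the pieces in a list also means you can run the same operation over all of them with lapply, which is the main reason separate global objects are not recommended.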

Data:

data <- structure(list(ID_CODE = c("001", "001", "001", "002", "002", 
"003"), Metric1 = c("0.94", "0.68", "0.8", "0.12", "0.56", "0.87"
)), class = "data.frame", row.names = c(NA, -6L))
jay.sf