I think I understand now; thank you for clarifying your question in the comments. If there's something I missed or you have any questions, please let me know.
Terminology, quickly
I believe you are interested in splitting a vector of strings into multiple shorter vectors of strings based on a pattern within each element. Loosely speaking, a list is a vector whose elements can themselves be vectors.
g is a vector of 20 string elements (see the Data code chunk below).
is.vector(g)
#> [1] TRUE
Here's a list that only contains one vector.
str(list(g))
#> List of 1
#> $ : chr [1:20] "New AS Plate 21_AS Plate_Sample 12_50.fcs" "New AS Plate 21_AS Plate_Sample 1_100.fcs" "New AS Plate 21_AS Plate_Sample 1_25.fcs" "New AS Plate 21_AS Plate_Sample 1_250.fcs" ...
Now onto the question...
In your question, you specifically ask about using assign(). Although using assign() can be convenient, [it is usually not recommended][1]. But sometimes you gotta do what you gotta do, no shame in that. Here's how you could use it manually, on one group at a time (like you show in your question).
# Using assign() one group at a time
h <- g[grep("Sample 1_", g)]
assign(x = "sample_1_group", value = h)
It is pretty straightforward (and seemingly logical) to use assign() in a for-loop.
The first step in defining a for-loop is deciding what the loop will "loop over", or in other words, what will change during each iteration of the loop. In your case, we are looking for the number that defines each of your groups. We can define a vector of those numbers manually or programmatically.
# Define groups manually
ids <- c(12,1,10,11)
ids
#> [1] 12 1 10 11
# Pattern match groups
all_ids <- gsub(pattern = ".*Sample (\\d+).*", replacement = "\\1", x = g)
all_ids
#> [1] "12" "1" "1" "1" "1" "1" "10" "10" "10" "10" "10" "11" "11" "11" "11"
#> [16] "11" "12" "12" "12" "12"
ids <- unique(all_ids)
ids
#> [1] "12" "1" "10" "11"
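One thing worth checking with the pattern-matching approach: sub() and gsub() return the input unchanged when the pattern does not match, so a file name without a sample number would end up in ids as a whole string. Here is a minimal sketch of a guard against that. The vector g_small (a shortened stand-in for g with one made-up non-matching file name) and the names id_pattern, has_id, and ids_checked are assumptions for illustration only.

```r
# Shortened, hypothetical stand-in for g, plus one made-up file name
# that does not contain "Sample <number>"
g_small <- c("New AS Plate 21_AS Plate_Sample 12_50.fcs",
             "New AS Plate 21_AS Plate_Sample 1_100.fcs",
             "New AS Plate 21_AS Plate_Blank.fcs")

id_pattern <- ".*Sample (\\d+).*"

# Without a check, the third element comes back as the full file name
sub(id_pattern, "\\1", g_small)

# grepl() flags which elements actually contain a sample number
has_id <- grepl(id_pattern, g_small)

# Extract ids only from the elements that matched
ids_checked <- unique(sub(id_pattern, "\\1", g_small[has_id]))
ids_checked
```

If every element of your real g follows the same naming scheme, this check is not needed, but it is cheap insurance when file names come from several sources.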
After we know what we are looping over, we can define the structure of the loop and the functions within it. paste0() can be a workhorse here. The loop below iterates over ids (one id at a time), finds matching strings in g, and writes them to your environment as a vector. Because we are using assign(), we expect a new vector to appear in our environment after each iteration of the loop.
# For-loop with assign
for(i in ids){
  a <- paste0("Sample ", i, "_")
  h <- g[grep(a, g)]
  h_name <- paste0("sample_", i, "_group")
  assign(x = h_name, value = h)
}
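If you do go the assign() route, you will need a way to get the objects back by name later. The sketch below shows one way to do that with get() and mget(). It runs the same kind of loop on a shortened stand-in for g; g_small, ids_small, group_names, and collected are names made up for this example.

```r
# Shortened, hypothetical stand-in for g
g_small <- c("New AS Plate 21_AS Plate_Sample 12_50.fcs",
             "New AS Plate 21_AS Plate_Sample 1_100.fcs",
             "New AS Plate 21_AS Plate_Sample 1_25.fcs")
ids_small <- c("12", "1")

# Same idea as the loop above: one new object per id
for (i in ids_small) {
  matches <- g_small[grep(paste0("Sample ", i, "_"), g_small, fixed = TRUE)]
  assign(x = paste0("sample_", i, "_group"), value = matches)
}

# Rebuild the object names, then look the objects up by name
group_names <- paste0("sample_", ids_small, "_group")

# get() fetches a single object by its name
get(group_names[1])

# mget() fetches several objects at once and returns them as a named list
collected <- mget(group_names)
names(collected)
```

Note that you end up building a list anyway to work with the groups together, which is part of why storing the results in a list from the start tends to be less work.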
That technically works, but it's not the best. You may find that it is actually more convenient to use lists (a vector of vectors) to store information from a for-loop. It's fast to program, you don't have a bunch of new objects crowding your workspace, and all the scary things (not really) in that link above won't be a problem. Here's how you could do that:
# Save the results of a for-loop in a list!
# First, make a blank list to hold the results
results <- list()
for(i in ids){
  a <- paste0("Sample ", i, "_")
  h <- g[grep(a, g)]
  h_name <- paste0("sample_", i, "_group")
  results[[h_name]] <- h
}
results
#> $sample_12_group
#> [1] "New AS Plate 21_AS Plate_Sample 12_50.fcs"
#> [2] "New AS Plate 21_AS Plate_Sample 12_100.fcs"
#> [3] "New AS Plate 21_AS Plate_Sample 12_25.fcs"
#> [4] "New AS Plate 21_AS Plate_Sample 12_250.fcs"
#> [5] "New AS Plate 21_AS Plate_Sample 12_500.fcs"
#>
#> $sample_1_group
#> [1] "New AS Plate 21_AS Plate_Sample 1_100.fcs"
#> [2] "New AS Plate 21_AS Plate_Sample 1_25.fcs"
#> [3] "New AS Plate 21_AS Plate_Sample 1_250.fcs"
#> [4] "New AS Plate 21_AS Plate_Sample 1_50.fcs"
#> [5] "New AS Plate 21_AS Plate_Sample 1_500.fcs"
#>
#> $sample_10_group
#> [1] "New AS Plate 21_AS Plate_Sample 10_100.fcs"
#> [2] "New AS Plate 21_AS Plate_Sample 10_25.fcs"
#> [3] "New AS Plate 21_AS Plate_Sample 10_250.fcs"
#> [4] "New AS Plate 21_AS Plate_Sample 10_50.fcs"
#> [5] "New AS Plate 21_AS Plate_Sample 10_500.fcs"
#>
#> $sample_11_group
#> [1] "New AS Plate 21_AS Plate_Sample 11_100.fcs"
#> [2] "New AS Plate 21_AS Plate_Sample 11_25.fcs"
#> [3] "New AS Plate 21_AS Plate_Sample 11_250.fcs"
#> [4] "New AS Plate 21_AS Plate_Sample 11_50.fcs"
#> [5] "New AS Plate 21_AS Plate_Sample 11_500.fcs"
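As an aside, base R's split() can build the same kind of named list without an explicit loop: it divides a vector into groups defined by a second vector or factor. This is a sketch of an alternative, not what the loop above does internally. It uses a shortened stand-in for g, and the names g_small, sample_id, group_label, group_factor, and results_split are assumptions for this example.

```r
# Shortened, hypothetical stand-in for g
g_small <- c("New AS Plate 21_AS Plate_Sample 12_50.fcs",
             "New AS Plate 21_AS Plate_Sample 1_100.fcs",
             "New AS Plate 21_AS Plate_Sample 1_25.fcs",
             "New AS Plate 21_AS Plate_Sample 10_100.fcs",
             "New AS Plate 21_AS Plate_Sample 12_100.fcs")

# One sample id per element, then one group label per element
sample_id <- sub(".*Sample (\\d+).*", "\\1", g_small)
group_label <- paste0("sample_", sample_id, "_group")

# A factor with explicit levels keeps groups in order of first appearance
# (otherwise split() would sort the labels alphabetically)
group_factor <- factor(group_label, levels = unique(group_label))

# split() returns a named list with one element per group
results_split <- split(g_small, group_factor)
names(results_split)
results_split$sample_12_group
```

Because the grouping is based on the extracted id rather than on a grep() per id, each string lands in exactly one group.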
Extra credit
For-loops are great: it's easy to see what's going on inside of them, it's easy to do a lot of data handling in them, and they are usually reasonably fast to execute. But sometimes it's all about speed or compact code. R is vectorized ([I'm honestly not exactly sure what this means][2] besides "it can do multiple calculations simultaneously"), and a for-loop doesn't take advantage of this very well. The apply() family of functions is a common alternative. Strictly speaking, these functions still loop internally rather than being truly vectorized, so they are not guaranteed to be faster, but they are compact, lapply() returns a list directly, and they are usually easy to implement in cases where you might also use a for-loop. Here's how you could do that with your data:
# lapply() instead of a for-loop
lapply(ids, function(i) g[grep(paste0("Sample ", i, "_"), g)])
#> [[1]]
#> [1] "New AS Plate 21_AS Plate_Sample 12_50.fcs"
#> [2] "New AS Plate 21_AS Plate_Sample 12_100.fcs"
#> [3] "New AS Plate 21_AS Plate_Sample 12_25.fcs"
#> [4] "New AS Plate 21_AS Plate_Sample 12_250.fcs"
#> [5] "New AS Plate 21_AS Plate_Sample 12_500.fcs"
#>
#> [[2]]
#> [1] "New AS Plate 21_AS Plate_Sample 1_100.fcs"
#> [2] "New AS Plate 21_AS Plate_Sample 1_25.fcs"
#> [3] "New AS Plate 21_AS Plate_Sample 1_250.fcs"
#> [4] "New AS Plate 21_AS Plate_Sample 1_50.fcs"
#> [5] "New AS Plate 21_AS Plate_Sample 1_500.fcs"
#>
#> [[3]]
#> [1] "New AS Plate 21_AS Plate_Sample 10_100.fcs"
#> [2] "New AS Plate 21_AS Plate_Sample 10_25.fcs"
#> [3] "New AS Plate 21_AS Plate_Sample 10_250.fcs"
#> [4] "New AS Plate 21_AS Plate_Sample 10_50.fcs"
#> [5] "New AS Plate 21_AS Plate_Sample 10_500.fcs"
#>
#> [[4]]
#> [1] "New AS Plate 21_AS Plate_Sample 11_100.fcs"
#> [2] "New AS Plate 21_AS Plate_Sample 11_25.fcs"
#> [3] "New AS Plate 21_AS Plate_Sample 11_250.fcs"
#> [4] "New AS Plate 21_AS Plate_Sample 11_50.fcs"
#> [5] "New AS Plate 21_AS Plate_Sample 11_500.fcs"
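The list returned by lapply() above is unnamed, unlike the results list from the for-loop. A minimal sketch of adding the same style of names afterwards is shown here, again on a shortened stand-in for g; g_small, ids_small, and groups are names assumed for the example.

```r
# Shortened, hypothetical stand-in for g
g_small <- c("New AS Plate 21_AS Plate_Sample 12_50.fcs",
             "New AS Plate 21_AS Plate_Sample 1_100.fcs",
             "New AS Plate 21_AS Plate_Sample 1_25.fcs",
             "New AS Plate 21_AS Plate_Sample 10_100.fcs")
ids_small <- c("12", "1", "10")

# lapply() returns one list element per id, in the same order as ids_small
groups <- lapply(ids_small, function(i) {
  g_small[grep(paste0("Sample ", i, "_"), g_small, fixed = TRUE)]
})

# Name the elements so they can be accessed with $ or [[ ]]
names(groups) <- paste0("sample_", ids_small, "_group")

groups$sample_1_group
```

The fixed = TRUE argument makes grep() treat the pattern as a literal string, which is enough here because the pattern contains no regular expression syntax.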
Created on 2021-10-14 by the reprex package (v2.0.1)
Data:
g <- c("New AS Plate 21_AS Plate_Sample 12_50.fcs",
"New AS Plate 21_AS Plate_Sample 1_100.fcs",
"New AS Plate 21_AS Plate_Sample 1_25.fcs",
"New AS Plate 21_AS Plate_Sample 1_250.fcs",
"New AS Plate 21_AS Plate_Sample 1_50.fcs",
"New AS Plate 21_AS Plate_Sample 1_500.fcs",
"New AS Plate 21_AS Plate_Sample 10_100.fcs",
"New AS Plate 21_AS Plate_Sample 10_25.fcs",
"New AS Plate 21_AS Plate_Sample 10_250.fcs",
"New AS Plate 21_AS Plate_Sample 10_50.fcs",
"New AS Plate 21_AS Plate_Sample 10_500.fcs",
"New AS Plate 21_AS Plate_Sample 11_100.fcs",
"New AS Plate 21_AS Plate_Sample 11_25.fcs",
"New AS Plate 21_AS Plate_Sample 11_250.fcs",
"New AS Plate 21_AS Plate_Sample 11_50.fcs",
"New AS Plate 21_AS Plate_Sample 11_500.fcs",
"New AS Plate 21_AS Plate_Sample 12_100.fcs",
"New AS Plate 21_AS Plate_Sample 12_25.fcs",
"New AS Plate 21_AS Plate_Sample 12_250.fcs",
"New AS Plate 21_AS Plate_Sample 12_500.fcs")
[1]: Why is using assign bad?
[2]: How do I know a function or an operation in R is vectorized?