I'm trying to write a for loop that drops each column of a data frame in turn and saves each modified data frame to a new variable.

This code illustrates what I would like the loop to do:

df1 = df[,-1]
df2 = df[,-2]
df3 = df[,-3]

#failed loop syntax 1 (unexpected "[" in "df[i] = df[,-[")

for (i in 1:3){
   df[i] = df[,-[i]]}

#failed loop syntax 2 (number of items to replace is not a multiple of replacement length)

for (i in 1:3){
   df[i] = df[,-i]}

Can anyone help me with this? This is an example to illustrate what I would like to achieve. The real data set contains 28 rows and 64 columns. I am trying to see how removing any one of the 64 columns affects the distribution of the 28 items in a K Cluster plot. I've tried PCA plots, but they are relatively useless with the 64 vectors.

EDIT:

slava-kohut's code (pasted below) worked perfectly for the first problem. Can anyone help me loop the output of the code below into a series of K cluster plots, with the name of the input data set as each plot's title?

for (i in 1:64){
  assign(paste0(deparse(substitute(mydata)),i),mydata[,-i])
}
Scott_A
  • I'd suggest reading my answer at [How to make a list of data frames?](https://stackoverflow.com/questions/17499013/how-do-i-make-a-list-of-data-frames/24376207#24376207) - generally variables named sequentially like that are hard to work with. You'll need to use `assign()` to create them, and `get()` to use them, and bugs are easy to make and hard to find. Can you give a broader context about *why* you want to do this, and perhaps we can recommend a better way? – Gregor Thomas Jun 03 '20 at 18:07

3 Answers

0

If I understood you correctly, this is what you want:

for (i in 1:3){
  assign(paste0(deparse(substitute(mtcars)),i), mtcars[,-i])
}

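`assign()` creates (or overwrites) a variable in your current environment. Here the variable name is built with `paste0()`, so the loop produces `mtcars1`, `mtcars2`, and `mtcars3`.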

EDIT: Modifying environments without understanding exactly what is going on can be dangerous and lead to bugs. What are you ultimately trying to achieve?
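To make that warning concrete, here is a minimal sketch (using the built-in `mtcars` data; the names `first` and `mtcars_list` are only illustrative) of what working with the generated variables looks like. Because the names are built as strings, reading them back in a loop needs `get()`, whereas a list can simply be indexed.

```r
# Create mtcars1, mtcars2, mtcars3, each with one column dropped
for (i in 1:3) {
  assign(paste0("mtcars", i), mtcars[, -i])
}

# Reading a generated variable back by its constructed name needs get()
first <- get(paste0("mtcars", 1))

# The list-based alternative holds the same data frames, indexed by position
mtcars_list <- lapply(1:3, function(i) mtcars[, -i])
identical(first, mtcars_list[[1]])
```

With the list there is nothing to look up by name, so a typo in a constructed name cannot silently pick up a stale variable.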

slava-kohut
  • Awesome, this did exactly what I was looking for! I edited my initial description to include more details. I have a data set of 28 rows, 64 columns. I would like to generate K cluster plots for the 28 items after removing each column to see the effect of that column on the cluster plot. So I need to generate 64 cluster plots using matrices I just generated. Yes, I've tried PCA plots, but they were useless with the 64 vectors. Thank you for your help! – Scott_A Jun 03 '20 at 21:46
  • Glad it helped. Consider accepting or upvoting if you like the answer. You can generate plots one by one without creating 64 variables. – slava-kohut Jun 03 '20 at 23:27
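
As a rough sketch of generating the plots one by one without 64 separate variables: the code below assumes "K cluster" means k-means, assumes `k = 3` clusters (the thread never gives the number), and uses random stand-in data in place of the real 28 x 64 data set. Drawing each clustering on the first two principal components of the reduced data is just one convenient way to get 2-D coordinates, not something taken from the question.

```r
# Hypothetical stand-in for the real 28 x 64 data set (columns V1..V64)
set.seed(42)
mydata <- as.data.frame(matrix(rnorm(28 * 64), nrow = 28))

k <- 3  # assumed number of clusters

# One k-means fit per dropped column, collected in a named list
fits <- lapply(seq_along(mydata), function(i) {
  kmeans(mydata[, -i, drop = FALSE], centers = k, nstart = 10)
})
names(fits) <- paste0("mydata without ", names(mydata))

# Write all plots to one PDF, one page per dropped column
pdf(file.path(tempdir(), "cluster_plots.pdf"))
for (i in seq_along(fits)) {
  reduced <- mydata[, -i, drop = FALSE]
  coords <- prcomp(reduced)$x[, 1:2]   # 2-D coordinates for plotting only
  plot(coords, col = fits[[i]]$cluster, pch = 19,
       main = names(fits)[i])          # title names the data that went in
}
invisible(dev.off())
```

Keeping the fits in a list means the title and the data come from the same index `i`, so they cannot get out of step.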
0

Creating minimal reproducible example data:

df <- data.frame(a=1:5, b=1:5, c=1:5)

Using 'lapply':

df_list <- lapply(1:3, function(x) df[, -x])

Returns:

[[1]]
  b c
1 1 1
2 2 2
3 3 3
4 4 4
5 5 5

[[2]]
  a c
1 1 1
2 2 2
3 3 3
4 4 4
5 5 5

[[3]]
  a b
1 1 1
2 2 2
3 3 3
4 4 4
5 5 5
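
The same idea can be written so that it does not hard-code the number of columns, which may help with the 64-column case. This is only a sketch: the `without_` naming scheme is an assumption made for illustration.

```r
# Same toy data as above
df <- data.frame(a = 1:5, b = 1:5, c = 1:5)

# seq_along(df) covers however many columns df has;
# drop = FALSE keeps each result a data frame even if only one column remains
df_list <- lapply(seq_along(df), function(i) df[, -i, drop = FALSE])

# Label each element by the column that was removed
names(df_list) <- paste0("without_", names(df))

names(df_list$without_b)  # "a" "c"
```

Named elements make it easy to tell later which column each data frame is missing, for example when titling plots.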
dario
0

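We can also use `sapply`. Pass `simplify = FALSE` so the result stays a list of data frames; with the default simplification, the equal-length data frames are collapsed into a matrix of list columns instead.

sapply(1:3, function(x) df[, -x], simplify = FALSE)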
akrun