R - dropping columns sequentially in for loop

Question

I'm trying to write a for loop that drops each column in a data frame and saves the modified data frame to a new variable.

This code illustrates what I would like the loop to do:

df1 = df[,-1]
df2 = df[,-2]
df3 = df[,-3]

#failed loop syntax 1 (unexpected "[" in "df[i] = df[,-[")

for (i in 1:3){
   df[i] = df[,-[i]]}

#failed loop syntax 2 (number of items to replace is not a multiple of replacement length)

for (i in 1:3){
   df[i] = df[,-i]}

Can anyone help me with this? This is an example to illustrate what I would like to achieve. The real data set contains 28 rows and 64 columns. I am trying to see how removing any one of the 64 columns affects the distribution of the 28 items in a K Cluster plot. I've tried PCA plots, but they are relatively useless with the 64 vectors.

EDIT:

slava-kohut's code (pasted below) worked perfectly for the first problem. Can anyone help me loop the output of the code below into a series of K cluster plots, with the name of the input data set as each plot's title?

for (i in 1:64){
  assign(paste0(deparse(substitute(mydata)),i),mydata[,-i])
}

I'd suggest reading my answer at [How to make a list of data frames?](https://stackoverflow.com/questions/17499013/how-do-i-make-a-list-of-data-frames/24376207#24376207) - generally variables named sequentially like that are hard to work with. You'll need to use `assign()` to create them, and `get()` to use them, and bugs are easy to make and hard to find. Can you give a broader context about *why* you want to do this, and perhaps we can recommend a better way? — Gregor Thomas, Jun 03 '20 at 18:07

slava-kohut · Accepted Answer · 2020-06-03T18:17:43.450

0

If I understood you correctly, this is what you want:

for (i in 1:3){
  assign(paste0(deparse(substitute(mtcars)),i), mtcars[,-i])
}

assign() creates a variable with the given name in your current environment.

EDIT: modifying environments without understanding what exactly is going on can be dangerous and lead to bugs. Why do you want to achieve what you want to achieve?
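To make the caution above concrete, here is a minimal sketch of how the variables created by the loop can be read back. It assumes the loop ran at the top level, so the names are simply mtcars1, mtcars2 and mtcars3; the name all_versions is introduced here purely for illustration.

```r
# Same loop as in the answer, with the name prefix written out directly.
for (i in 1:3) {
  assign(paste0("mtcars", i), mtcars[, -i])
}

# get() looks up one variable by its name as a string.
second <- get("mtcars2")
ncol(second)

# mget() collects several such variables into a named list in one call.
all_versions <- mget(paste0("mtcars", 1:3))
names(all_versions)
```

Once the data frames are gathered into a list like this, they can be processed with an ordinary loop or lapply() instead of repeated get() calls.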

edited Jun 03 '20 at 18:17

answered Jun 03 '20 at 18:11

slava-kohut


Awesome, this did exactly what I was looking for! I edited my initial description to include more details. I have a data set of 28 rows and 64 columns. I would like to generate K cluster plots for the 28 items after removing each column, to see the effect of that column on the cluster plot. So I need to generate 64 cluster plots using the matrices I just generated. Yes, I've tried PCA plots, but they were useless with the 64 vectors. Thank you for your help! – Scott_A Jun 03 '20 at 21:46
Glad it helped. Consider accepting or upvoting if you like the answer. You can generate plots one by one without creating 64 variables. – slava-kohut Jun 03 '20 at 23:27
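A rough sketch of what plotting one by one without the 64 variables could look like is shown below. Several things here are assumptions rather than anything stated in the thread: mtcars stands in for the real 28 x 64 data set, "K cluster" is read as k-means via kmeans(), k = 3 is arbitrary, and the first two principal components are used only to place the points in two dimensions.

```r
# Hypothetical stand-in for the asker's numeric data set.
mydata <- mtcars

# Arbitrary assumption; pick a number of clusters that suits the data.
k <- 3

titles <- character(ncol(mydata))
for (i in seq_len(ncol(mydata))) {
  # Drop column i and standardise the remaining columns.
  reduced <- scale(mydata[, -i])

  # Fixed seed so the random cluster starts are reproducible.
  set.seed(1)
  fit <- kmeans(reduced, centers = k, nstart = 10)

  # 2-D coordinates for display only; clustering used all remaining columns.
  pcs <- prcomp(reduced)$x[, 1:2]

  # The plot title records which column was removed.
  titles[i] <- paste("Without column:", names(mydata)[i])
  plot(pcs, col = fit$cluster, pch = 19, main = titles[i])
  text(pcs, labels = rownames(mydata), pos = 3, cex = 0.6)
}
```

Wrapping the loop in pdf("plots.pdf") and dev.off() would put every plot on its own page of a single file, which may be easier to flip through than 64 separate windows.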

score 0 · Answer 2 · answered Jun 03 '20 at 18:13

Creating minimal reproducible example data:

df <- data.frame(a=1:5, b=1:5, c=1:5)

Using 'lapply':

df_list <- lapply(1:3, function(x) df[, -x])

Returns a list of three data frames, each with one of the original columns removed.
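A slightly extended sketch of the same idea follows. The drop = FALSE argument and the "without_" names are additions for illustration, not part of the original answer: the former keeps each element a data frame even when only one column would remain, and the latter labels each element by the column that was removed.

```r
df <- data.frame(a = 1:5, b = 1:5, c = 1:5)

# seq_along(df) iterates over column positions, whatever the column count.
df_list <- lapply(seq_along(df), function(i) df[, -i, drop = FALSE])

# Name each element after the column it is missing.
names(df_list) <- paste0("without_", names(df))

# Columns left after removing "a".
names(df_list$without_a)
```

Elements can then be retrieved by name, for example df_list$without_b, rather than through sequentially numbered variables.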

score 0 · Answer 3 · answered Jun 03 '20 at 18:29


We can also use sapply

sapply(1:3, function(x) df[, -x])
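One caveat worth checking: by default sapply() tries to simplify its result, so the return value may not be a plain list of data frames. The variant below, a sketch that is not part of the original answer, passes simplify = FALSE so that the result is a list with one data frame per dropped column.

```r
df <- data.frame(a = 1:5, b = 1:5, c = 1:5)

# simplify = FALSE stops sapply() from collapsing the results into an array.
res <- sapply(1:3, function(x) df[, -x], simplify = FALSE)

# Each element is a data frame with one column removed.
class(res[[1]])
names(res[[2]])
```

With simplify = FALSE the call behaves like lapply() here, so either function can be used.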

answered Jun 03 '20 at 18:29

akrun

