
I have a large dataset that I am trying to subset by selecting columns based on an arithmetic progression. My dataset has 370 columns. I want to remove 6 columns out of every 18 columns. What I did was:

a=seq(from=5, to =365, by=18)
# num [1:21] 5 23 41 59 77 95 113 131 149 167 ...

and

b=seq(from=10, to =370, by=18)

to find the column numbers I need to remove. Essentially, I need to remove columns -[a:b], that is, c(-5:-10, -(5+1*18):-(10+1*18), -(5+2*18):-(10+2*18), etc.)

I tried to create a for loop to do that as follows:

for(i in 1:21) {temp <- subset(set, select = -c( a[i]:b[i]))}

# Error in a[i]:b[i] : NA/NaN argument

but it doesn't work; I get the error shown above.

mnel
user3495945

3 Answers


Please read

Why is `[` better than `subset`?

to understand why subset is not appropriate here.

set[,-unlist(Map(":",a,b))]

This will return what you want.
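As a small sketch of what that expression builds: `Map(":", a, b)` pairs each start in `a` with the matching end in `b` and returns a list of ranges, and `unlist` flattens that list into one index vector. The `a` and `b` vectors are the ones from the question; `demo` is a made-up stand-in for the real data, assumed here only for illustration.

```r
a <- seq(from = 5, to = 365, by = 18)
b <- seq(from = 10, to = 370, by = 18)

# One range per (start, end) pair, flattened into a single vector
drop_idx <- unlist(Map(":", a, b))
head(drop_idx, 12)
# 5 6 7 8 9 10 23 24 25 26 27 28

# Hypothetical stand-in for the real data: 3 rows, 370 columns
demo <- as.data.frame(matrix(0, nrow = 3, ncol = 370))
kept <- demo[, -drop_idx]
ncol(kept)
# 244
```

With 21 blocks of 6 columns, 126 columns are dropped and 244 remain.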

mnel
  • Thanks for your response. I am familiar with set and unlist, so I am assuming you set a vector that includes the unlisted contents of the parentheses as columns. But I am not sure what (Map) refers to. Can you help? I couldn't find this online. – user3495945 Jul 18 '14 at 03:44
  • @user3495945 `Map` is a wrapper doing the equivalent of `mapply(...,SIMPLIFY=FALSE)` (as you would have found by typing `Map` or `?Map` at the R console). – mnel Jul 18 '14 at 04:00
  • Wow! The things I don't know! Thanks. I understand that mapply is the name for a loop that R utilizes. I am wondering how the code would look if one attempted to write it in a "for loop" format. – user3495945 Jul 20 '14 at 04:39
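For the "for loop" form asked about in the last comment, one possible sketch is to collect the column numbers inside the loop and subset once afterwards. The loop in the question reassigned `temp` from the full `set` on every pass, so even without the error it would only have dropped the last block. Here `demo` is a hypothetical stand-in for the real data, and `drop_idx` is a name chosen for illustration.

```r
a <- seq(from = 5, to = 365, by = 18)
b <- seq(from = 10, to = 370, by = 18)

# Hypothetical stand-in for the real data: 3 rows, 370 columns
demo <- as.data.frame(matrix(0, nrow = 3, ncol = 370))

drop_idx <- integer(0)
for (i in seq_along(a)) {
  # Append this block's column numbers to the running vector
  drop_idx <- c(drop_idx, a[i]:b[i])
}

# Subset once, after the loop, so every block is removed
temp <- demo[, -drop_idx]
dim(temp)
# 3 244
```

Growing a vector inside a loop is fine at this size; for much longer inputs the vectorised forms are usually preferable.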

Not sure exactly what "remove 6 columns every 18 columns" actually means, but here is one interpretation, i.e., remove the last 6 columns within groups of 18:

smlset <- set[  , c( rep(TRUE, 12), rep(FALSE, 6) ) ]

If you wanted the 5th to the 10th columns in each group of 18 removed, it would be:

smlset <- set[  , c( rep(TRUE, 4), rep(FALSE, 6), rep(TRUE,8) ) ]
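Both forms rely on R recycling a logical index that is shorter than the number of columns. As a rough check of the second pattern, the sketch below makes the recycling explicit with `rep_len` (the names `pattern`, `full`, and `dropped` are made up for illustration) and lists which of the 370 columns would be dropped.

```r
pattern <- c(rep(TRUE, 4), rep(FALSE, 6), rep(TRUE, 8))

# Stretch the 18-element pattern over all 370 columns, as indexing would
full <- rep_len(pattern, 370)

dropped <- which(!full)
head(dropped, 12)
# 5 6 7 8 9 10 23 24 25 26 27 28

sum(full)  # number of columns kept
# 244
```

Since 370 is not a multiple of 18, the final partial group (columns 361 to 370) gets only the first 10 elements of the pattern, so columns 365 to 370 are still dropped.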
IRTFM
  • Thanks Bonded Dust and mnel! I was not able to understand akrun's approach. Since I am a newbie, can you tell me which package includes the smlset function? Also, can you tell me where to find more info on Map? – user3495945 Jul 18 '14 at 03:40
  • `smlset` was just the name I made up to be the reference to the returned object. I was trying to make it clear that it was a proper subset of the `set` object. (Probably poor practice to use "set" as an object name since there are "set" functions.) The only functions being used are `<-`, `[`, `c`, and `rep`. – IRTFM Jul 18 '14 at 03:47

You may also try

set.seed(42)
set <- matrix(sample(25, 370*5,replace=TRUE), ncol=370, dimnames=list(NULL,1:370))
set[,-sort(5+(0:trunc(370/18))*18 +rep(0:5, each=ceiling(370/18)))]
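One way to read that index expression, offered as a sketch rather than as the author's own explanation: `5+(0:20)*18` gives the 21 block starts, and `rep(0:5, each=21)` adds each offset 0 to 5 to every start through recycling. An arguably easier-to-read equivalent uses `outer`; the names `starts`, `idx_outer`, and `idx_orig` are made up for illustration.

```r
starts <- seq(from = 5, to = 365, by = 18)

# Every start plus every offset 0..5, then sorted into column order
idx_outer <- sort(as.vector(outer(starts, 0:5, "+")))

# The expression from the answer above, for comparison
idx_orig <- sort(5 + (0:trunc(370/18)) * 18 + rep(0:5, each = ceiling(370/18)))

all(idx_outer == idx_orig)
# TRUE
```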
akrun