
I have a data frame with an unknown number of columns (all I know is that there will be at least 4).

For every group of 4 columns, I want to delete columns 3 and 4.

For example, if my data frame has 12 columns, I want to delete columns 3, 4, 7, 8, 11, and 12.

I know that I can delete every nth column like this:

df <- df[, -seq(2, ncol(df), 4)]

But how can I delete every 3rd and 4th column in the same way using R?

Thank you.

Jstation
  • Related: [Deleting Every nth Column from a Dataframe in R](https://stackoverflow.com/questions/61327783/deleting-every-nth-column-from-a-dataframe-in-r) – Ian Campbell May 09 '21 at 18:44

2 Answers


Since these are column indices, negate them with - to drop those columns:

i1 <- rep(seq(3, ncol(df), 4) , each = 2) + 0:1
df[,-i1]
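
To see why this index vector lands on columns 3 and 4 of every block, it can help to build it one step at a time. The following is only a sketch, assuming a 12-column data frame like the one in the data section; the intermediate names `starts`, `doubled`, and `kept` are illustrative and not part of the answer.

```r
# Illustrative data: 4 rows, 12 columns named V1..V12
set.seed(24)
df <- as.data.frame(matrix(rnorm(12 * 4), 4, 12))

starts  <- seq(3, ncol(df), 4)     # 3 7 11: the 3rd column of each block of 4
doubled <- rep(starts, each = 2)   # 3 3 7 7 11 11
i1      <- doubled + 0:1           # 0:1 is recycled, giving 3 4 7 8 11 12

kept <- df[, -i1]                  # negative indices drop those columns
names(kept)                        # "V1" "V2" "V5" "V6" "V9" "V10"
```

The `each = 2` step is what makes room for the `+ 0:1` offset: each start position appears twice, once for itself and once for its right-hand neighbour.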

Another option is a logical index, which R recycles across all the columns:

df[!c(FALSE, FALSE, TRUE, TRUE)]
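
The recycling here happens implicitly, so it may be worth spelling it out once. This sketch assumes the same 12-column example data; `pattern` and `mask` are illustrative names, and `rep_len()` is used only to make the recycled mask visible.

```r
set.seed(24)
df <- as.data.frame(matrix(rnorm(12 * 4), 4, 12))

pattern <- !c(FALSE, FALSE, TRUE, TRUE)   # TRUE TRUE FALSE FALSE

# Spell out the recycling: repeat the pattern until it covers every column
mask <- rep_len(pattern, ncol(df))

names(df)[mask]                           # "V1" "V2" "V5" "V6" "V9" "V10"
identical(df[mask], df[pattern])          # the explicit and recycled forms agree
```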

data

set.seed(24)
df <- as.data.frame(matrix(rnorm(12 * 4), 4, 12))
akrun
  • I had to edit it to i1 <- rep(seq(3, ncol(df), 4) , each = 2) + 0:1 (changing the first 2 to a 3) but after that it worked perfectly, thank you. – Jstation May 07 '21 at 17:00

Update: After some practice, here is a shorter version adapted from akrun's code. Instead of dropping columns 3 and 4 of each block, it selects columns 1 and 2:

library(dplyr)
df %>% 
  select(rep(seq(1, ncol(df), 4) , each = 2) + 0:1)

Output:

          V1         V2         V5         V6         V9        V10
1  0.3351943  0.7696819  0.5713866  1.3496121 -0.5712432  0.3612125
2 -0.2318646  1.7709054 -1.2799872 -1.5676166  0.4226218  1.0568642
3  0.5266526 -0.1961822 -1.2388796  0.1437999 -1.6733858 -1.9929205
4 -1.0736261  0.2047497 -0.9225911 -0.8861100 -1.1360259  0.7643851
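
If the block size or the number of columns to keep might change, the same "select what to keep" idea could be wrapped in a small base R function. This is a hedged sketch rather than anything from the answers: `keep_first_k_of_n` is a hypothetical helper, and it uses a modulo on the column position instead of `rep()` and `seq()`.

```r
# Hypothetical helper: keep the first `k` columns out of every block of `n`
keep_first_k_of_n <- function(df, k = 2, n = 4) {
  stopifnot(k >= 0, n >= 1, k <= n)
  # Position of each column inside its block: 0, 1, ..., n - 1, 0, 1, ...
  pos_in_block <- (seq_along(df) - 1) %% n
  df[pos_in_block < k]
}

set.seed(24)
df <- as.data.frame(matrix(rnorm(12 * 4), 4, 12))

names(keep_first_k_of_n(df))         # "V1" "V2" "V5" "V6" "V9" "V10"
names(keep_first_k_of_n(df[1:6]))    # "V1" "V2" "V5" "V6"
```

Because the mask is computed per column, the second call shows that a trailing, incomplete block is handled the same way as a full one.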

First answer: Thanks to akrun for the data. With it, I tried the approach I was able to come up with: pivot to long format, group the values in blocks of 4 (1, 1, 1, 1, 2, 2, 2, 2, and so on), keep the first two rows of each group, then pivot back to wide format. I know it is awkward, but it works.

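set.seed(24)
df <- as.data.frame(matrix(rnorm(12 * 4), 4, 12))

library(tidyverse)  # loads dplyr and tidyr

# Long format: one row per cell, numbered in blocks of 4;
# keep the first two cells of each block
df_long <- df %>%
  pivot_longer(
    cols = everything(),
    names_to = "names",
    values_to = "values"
  ) %>% 
  mutate(Col2 = rep(row_number(), each = 4, length.out = n())) %>% 
  group_by(Col2) %>% 
  slice_head(n = 2) %>% 
  ungroup()

# Back to wide format; pivot_wider() returns list-columns here,
# so unnest them into ordinary columns
df1 <- df_long %>%
  select(-Col2) %>% 
  pivot_wider(
    names_from = names,
    values_from = values
  ) %>% 
  unnest(cols = everything())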

Output:

      V1     V2     V5      V6     V9    V10
   <dbl>  <dbl>  <dbl>   <dbl>  <dbl>  <dbl>
1 -0.546  0.847 -0.335 -0.0743  0.613 -4.47 
2  0.537  0.266  1.54  -0.605   1.52   0.369
3  0.420  0.445  0.610 -1.71    0.657  0.169
4 -0.584 -0.466  0.516 -0.269  -1.07  -1.82 
TarJae