
I have to translate some R code into Python. I got stuck on the following line, which I don't understand:

new_data <- data %>% select(-contains('exit'),exit)

What does -contains do with a string? As for select, I understand that the second exit refers to a specific column named exit, but what does -contains mean? That the exit column will not contain the string "exit"?

Thanks

MLavoie

1 Answer

It removes every column whose name contains the string 'exit'. Consider this example:

df <- data.frame(a = 11:15, exit = 1:5, exit1 = letters[1:5], exit2 = LETTERS[1:5])
df
#   a exit exit1 exit2
#1 11    1     a     A
#2 12    2     b     B
#3 13    3     c     C
#4 14    4     d     D
#5 15    5     e     E

If you do

df %>% select(contains('exit')) 

#  exit exit1 exit2
#1    1     a     A
#2    2     b     B
#3    3     c     C
#4    4     d     D
#5    5     e     E

Here, it selects the columns that have exit in their names. When we add a minus (-) sign to it, it removes those columns instead:

df %>% select(-contains('exit'))

#   a
#1 11
#2 12
#3 13
#4 14
#5 15

In your case, it removes all the columns that have exit in their names but then adds back the column named exit, which ends up as the last column. The rest of the columns stay as they are:

df %>% select(-contains('exit'),exit)

#   a exit
#1 11    1
#2 12    2
#3 13    3
#4 14    4
#5 15    5
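
Since the goal is a Python translation, here is a minimal sketch of the same selection logic applied to a plain list of column names. The helper name select_without_containing and its keep parameter are hypothetical, made up for illustration. The matching is case-insensitive because contains() ignores case by default.

```python
def select_without_containing(columns, needle, keep=()):
    """Mimic select(-contains(needle), keep...) on a list of column names.

    Hypothetical helper for illustration; not part of dplyr or pandas.
    """
    needle_lower = needle.lower()  # contains() ignores case by default
    # Drop every column whose name contains the substring.
    result = [c for c in columns if needle_lower not in c.lower()]
    # Add back the explicitly named columns, at the end, in the order given.
    for name in keep:
        if name not in columns:
            raise KeyError(f"column not found: {name!r}")
        if name not in result:
            result.append(name)
    return result


columns = ["a", "exit", "exit1", "exit2"]
selected = select_without_containing(columns, "exit", keep=["exit"])
print(selected)  # ['a', 'exit']
```

With a pandas DataFrame, the resulting list could then be used for column selection, e.g. df[select_without_containing(list(df.columns), "exit", keep=["exit"])].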
Ronak Shah