How can I iterate the levels of a factor in R?

Question

I would like to write a function that helps me identify possible mistakes in the levels of a factor by checking the first letter of each level. For now I am focusing on the identification part.

Data frame: '''

alleles<-(c('A*24:02', 'A*11:01', 'blank',  'A*31:01'))
as.factor(alleles)
freq<-c(0.3782, 0.4209, 0.0362, 0.0761)

df<-data.frame(alleles, freq)

'''

My attempt: '''

for(i in df$alleles){
  if (i != 'A'){
    can<-c()
    append(can, i)
    df$alleles<-df$alleles[-c(can)]
  }
}

'''

Error message:

Error in -c(can) : invalid argument to unary operator

Observations: If I run '''print(can)''' the output is "NULL", which means that my use of "append" is not working.

Please provide your expected result. It's a little difficult to understand what your code is supposed to do. If you just want the first character in `alleles`, you can use `substr(df$alleles, 1, 1)`. — andrew_reece, Nov 21 '20 at 19:07
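A minimal sketch of the suggestion in the comment above, assuming the `alleles` values from the question. `substr()` is vectorised, so one call returns the first character of every element and no loop is needed. The names `first_letter` and `suspect` are illustrative and do not appear in the original post.

```r
# Data from the question
alleles <- c('A*24:02', 'A*11:01', 'blank', 'A*31:01')

# substr() is vectorised: one call gives the first character of every element
first_letter <- substr(alleles, 1, 1)

# Elements whose first character is not 'A' are candidates for mistakes
suspect <- alleles[first_letter != 'A']

print(first_letter)
print(suspect)
```

Comparing the whole string with `'A'`, as in `i != 'A'`, is true for every allele here, which is why the first character has to be extracted before the comparison.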

score 1 · Accepted Answer · answered Nov 21 '20 at 19:16

You can also try:

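#Data
alleles<-c('A*24:02', 'A*11:01', 'blank', 'A*31:01')
freq<-c(0.3782, 0.4209, 0.0362, 0.0761)
df<-data.frame(alleles, freq)
#Collector for the values that do not start with 'A'
can<-c()
#Check the first character of each value
for(i in seq_along(df$alleles))
{
  if (substr(df$alleles[i],1,1) != 'A'){
    can <- c(can, as.character(df$alleles[i]))
  }
}
#Apply: a logical mask also works when 'can' is empty,
#whereas -which(...) would then drop every row
df<-df[!(df$alleles %in% can),]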

Output:

df
  alleles   freq
1 A*24:02 0.3782
2 A*11:01 0.4209
4 A*31:01 0.0761
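Two things appear to have kept the loop in the question from collecting anything: `append()` returns a new vector instead of modifying `can` in place, so its result has to be assigned, and `can<-c()` inside the loop resets the collector on every pass. The sketch below iterates over the factor levels, as the title asks. It assumes `alleles` is actually stored as a factor (the question's `as.factor(alleles)` call discards its result), and the loop variable `lev` is an illustrative name.

```r
# Store the alleles as a factor; the result of factor() must be assigned
alleles <- factor(c('A*24:02', 'A*11:01', 'blank', 'A*31:01'))

can <- character(0)              # create the collector once, outside the loop
for (lev in levels(alleles)) {   # levels() returns the distinct labels as character
  if (!startsWith(lev, 'A')) {
    can <- append(can, lev)      # append() returns a new vector; assign it back
  }
}

print(can)
```

Looping over `levels(alleles)` visits each distinct label once, even if the same allele occurs in many rows.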


Duck


It's incredible, thank you so much! I was wondering if you can tell me how to find the documentation of '%in%'. I have typed '?%in%' but it doesn't work – Christ14n97 Nov 21 '20 at 19:54

@Christ14n97 Try ?`%in%` – Duck Nov 21 '20 at 20:33

@Christ14n97 Also check this post about that operator https://stackoverflow.com/questions/12730629/what-do-the-op-operators-in-mean-for-example-in – Duck Nov 21 '20 at 20:37
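For reference, a short sketch of how the help page for a special operator can be opened. A bare `?%in%` fails because the parser cannot read the unquoted operator, so the name has to be quoted or wrapped in backticks. The `interactive()` guard and the variable `res` are additions for illustration only.

```r
# Special operators must be quoted (or backticked) when asking for help.
# help("%in%") is equivalent to ?"%in%" or ?`%in%`; the page is shared with match().
if (interactive()) {
  help("%in%")
}

# %in% returns one logical value per element of its left-hand side
res <- c('blank', 'A*24:02') %in% c('blank')
print(res)
```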

I appreciate your help so much! – Christ14n97 Nov 22 '20 at 11:04

iod · Answer 2 · 2020-11-21T20:05:40.270


Why not just use a regular expression?

df[grepl("^A", df$alleles),]
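A sketch of what the one-liner above does, assuming the data frame from the question. `grepl()` returns one logical value per element, and the pattern `"^A"` anchors the match at the start of the string. The name `keep` is illustrative.

```r
# Data from the question
alleles <- c('A*24:02', 'A*11:01', 'blank', 'A*31:01')
freq <- c(0.3782, 0.4209, 0.0362, 0.0761)
df <- data.frame(alleles, freq)

# grepl() returns TRUE/FALSE for each element; "^A" means "starts with A"
keep <- grepl("^A", df$alleles)
print(keep)

# Negating the mask lists the rows that look like mistakes instead of dropping them
print(df[!keep, ])
```

Inspecting `df[!keep, ]` first may be useful when the goal is to identify suspicious levels rather than silently remove them.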

edited Nov 21 '20 at 20:05

answered Nov 21 '20 at 19:12

iod


score 0 · Answer 3 · answered Nov 21 '20 at 21:36


We can use grep

df[grep("^A", df$alleles),]
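A sketch of how `grep()` differs from `grepl()`, assuming the allele values from the question: `grep()` returns the integer positions of the matches rather than a logical mask. That works well for keeping rows, but dropping rows with a negated `grep()` result can surprise when nothing matches. The names `idx`, `none`, and `non_matching` are illustrative.

```r
alleles <- c('A*24:02', 'A*11:01', 'blank', 'A*31:01')

# grep() returns the integer positions of the matching elements
idx <- grep("^A", alleles)
print(idx)

# Pitfall when dropping by negative index: with no match, grep() returns
# integer(0), and x[-integer(0)] selects nothing rather than everything
none <- grep("^Z", alleles)
print(length(alleles[-none]))

# invert = TRUE asks grep() directly for the non-matching positions
non_matching <- alleles[grep("^A", alleles, invert = TRUE)]
print(non_matching)
```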


akrun

