I have a column in a data frame filled with species names, and I want to remove certain ambiguous expressions that occasionally appear in them. I used this:

df$species_name<-gsub(" sp.", "", df$species_name)
df$species_name<-gsub(" sp. nov", "", df$species_name)
df$species_name<-gsub(" cf.", "", df$species_name)
df$species_name<-gsub(" complex.", "", df$species_name)
df$species_name<-gsub(" cmplx.", "", df$species_name)
df$species_name<-gsub(" pr.", "", df$species_name)
df$species_name<-gsub(" f.", "", df$species_name)
df$species_name<-gsub(" nr.", "", df$species_name)
df$species_name<-gsub(" s.l.", "", df$species_name)
df$species_name<-gsub(" grp.", "", df$species_name)

Those are the expressions I want to remove, but I think the "." characters are the source of the confusion (even though I want to remove the "." as well). From what I gather, a species named something like "Tanytarsus fanderseni" is removed from the data frame, and I suspect this is because of the gsub(" f.", "", df$species_name) call, but I don't know how to solve it. Thanks in advance for any answers.

tadeufontes
  • `.` has a special meaning in regex. You can escape the `.` with a backslash: `gsub(" sp\\.", "", df$species_name)`. Or, probably better, use `fixed = TRUE`: `gsub(" sp.", "", df$species_name, fixed = TRUE)`. – Ronak Shah Jan 22 '20 at 10:51
  • just add `fixed = TRUE` in gsub – Sotos Jan 22 '20 at 10:52
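
A minimal sketch of what the two comments suggest, not a tested answer for the real data. The sample rows and the names `tokens`, `cleaned`, `escaped` and `broken` are made up for illustration; only `df$species_name` and the patterns come from the question. With the default regex matching, `.` stands for any character, so `" f."` also matches `" fa"` in "Tanytarsus fanderseni". One extra assumption: with literal matching, `" sp. nov"` should be removed before `" sp."`, otherwise a stray `" nov"` is left behind.

```r
# Hypothetical sample data; only the column name comes from the question.
df <- data.frame(
  species_name = c("Tanytarsus fanderseni", "Tanytarsus sp.",
                   "Tanytarsus sp. nov", "Tanytarsus cf. gracilentus"),
  stringsAsFactors = FALSE
)

# The problem: as a regex, "." matches any character, so " fa" is deleted.
broken <- gsub(" f.", "", "Tanytarsus fanderseni")
# broken is "Tanytarsusnderseni"

# Longer tokens first, so " sp. nov" is removed before " sp." can split it.
tokens <- c(" sp. nov", " sp.", " cf.", " complex.", " cmplx.",
            " pr.", " f.", " nr.", " s.l.", " grp.")

cleaned <- df$species_name
for (tok in tokens) {
  # fixed = TRUE treats the pattern as a literal string, so "." is just a dot.
  cleaned <- gsub(tok, "", cleaned, fixed = TRUE)
}

# Equivalent regex form for a single token: escape the dot with "\\.".
escaped <- gsub(" f\\.", "", df$species_name)
```

Here `cleaned` keeps "Tanytarsus fanderseni" intact and reduces the other three rows to "Tanytarsus", "Tanytarsus" and "Tanytarsus gracilentus". Assigning `cleaned` back to `df$species_name` would apply the result to the data frame.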

0 Answers