I have a column in a dataframe that looks like this:

STATUTE
961.41(1)(A)
961.41(1)(B)
961.41(1)(A)
961.41(1)(A)
961.41(1)(C)

I'm trying to use grepl to identify any record in the "STATUTE" column that contains the string '961.41(1)(A)' and write a description to a new column when that string is found:

my_data$STATUTE_DESCR[grepl('961.41(1)(A)',my_data$STATUTE, ignore.case = TRUE)] <- "DELIVER/MANUF CONTROLLED SUBSTANCES"

Desired output:

STATUTE           STATUTE_DESCR
961.41(1)(A)      DELIVER/MANUF CONTROLLED SUBSTANCES
961.41(1)(B)
961.41(1)(A)      DELIVER/MANUF CONTROLLED SUBSTANCES
961.41(1)(A)      DELIVER/MANUF CONTROLLED SUBSTANCES
961.41(1)(A)4     DELIVER/MANUF CONTROLLED SUBSTANCES
961.41(1)(A)(B)   DELIVER/MANUF CONTROLLED SUBSTANCES
961.41(1)(C)

But the grepl call does not match anything when the pattern contains parentheses. Can anyone advise?

oguz ismail
DiamondJoe12
  • You need `fixed = TRUE` and not `ignore.case`, i.e. `my_data$STATUTE_DESCR[grepl('961.41(1)(A)',my_data$STATUTE, fixed = TRUE)] <- "DELIVER/MANUF CONTROLLED SUBSTANCES"` – Onyambu Oct 21 '22 at 17:48
  • `grepl('961\\.41\\(\\d+\\)\\(A\\)', '961.41(1)(A)')` returns `[1] TRUE`. Escaping works for more generalized cases, but `fixed = TRUE` is enough here, as above. – Chris Oct 21 '22 at 17:50

1 Answer

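You have to escape the metacharacters ( and ) using \\. Unescaped, they are treated as regex grouping operators rather than literal parentheses, so the pattern never matches. Use '961.41\\(1\\)\\(A\\)' as the regex, or set fixed = TRUE inside grepl. Note that the unescaped . matches any character; write \\. if only a literal dot should match.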

> my_data$STATUTE_DESCR[grepl('961.41\\(1\\)\\(A\\)',my_data$STATUTE, ignore.case = TRUE)] <- "DELIVER/MANUF CONTROLLED SUBSTANCES"
> my_data
       STATUTE                       STATUTE_DESCR
2 961.41(1)(A) DELIVER/MANUF CONTROLLED SUBSTANCES
3 961.41(1)(B)                                <NA>
4 961.41(1)(A) DELIVER/MANUF CONTROLLED SUBSTANCES
5 961.41(1)(A) DELIVER/MANUF CONTROLLED SUBSTANCES
6 961.41(1)(C)                                <NA>
> 
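For comparison, here is a minimal sketch of the three variants discussed above: the original unescaped pattern, the literal match with fixed = TRUE, and the fully escaped regex. The data frame is a hypothetical sample built from the values shown in the question, not the asker's real data.

```r
# Hypothetical sample data mirroring the STATUTE column from the question
my_data <- data.frame(
  STATUTE = c("961.41(1)(A)", "961.41(1)(B)", "961.41(1)(A)4",
              "961.41(1)(A)(B)", "961.41(1)(C)"),
  stringsAsFactors = FALSE
)

pattern <- "961.41(1)(A)"

# Read as a regex, the parentheses only group, so this searches for
# "961", any one character, then "411A"; none of the rows contain that.
regex_hits <- grepl(pattern, my_data$STATUTE)

# fixed = TRUE treats the pattern as a plain string, parentheses included.
fixed_hits <- grepl(pattern, my_data$STATUTE, fixed = TRUE)

# Equivalent regex with the dot and the parentheses escaped.
escaped_hits <- grepl("961\\.41\\(1\\)\\(A\\)", my_data$STATUTE)

# Fill the description only where the literal string was found.
my_data$STATUTE_DESCR <- NA_character_
my_data$STATUTE_DESCR[fixed_hits] <- "DELIVER/MANUF CONTROLLED SUBSTANCES"

print(my_data)
```

One design point: grepl ignores ignore.case (with a warning) when fixed = TRUE is set, so if case-insensitive matching is needed, the escaped regex with ignore.case = TRUE is probably the safer choice.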
Jilber Urbina