Thank you so much for your help in advance.

I have a field named "ERROR_COLAB" in which a series of responses are concatenated into a single long string. Because of the nature of the errors that can appear there, there is no formal, objective, efficient way to split the values in "ERROR_COLAB" and classify the responses concatenated in them.

So I was thinking: what if I create a data frame with the values I need to extract, and later turn them into a regex in order to extract them? To illustrate my idea:

Let's say I have this data frame:

code_error   meaning
po_R83       No_call_bak
?OP          card_nofunds
HOTELARCH78  overbookings

and I have the following values in "ERROR_COLAB"

ERROR_COLAB
?OP_ERR7+JSU8.OIJK1
po_R83_io
IOS_NEVER:300SSSS
HOTELARCH78?123-

I would like to know whether the first part of each string is equal to any of the values in the "code_error" field of the data frame containing the codes and meanings. So my desired result would look like this:

ERROR_COLAB          code_error_matched  meaning
?OP_ERR7+JSU8.OIJK1  ?OP                 card_nofunds
po_R83_io            po_R83              No_call_bak
IOS_NEVER:300SSSS    N.A                 N.A
HOTELARCH78?123-     HOTELARCH78         overbookings

Thank you so much, guys! Like, truly!

data:

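library(tibble)  # provides tribble()

# Lookup table of known error codes and their meanings
codes <- tribble(
  ~code_error,   ~meaning,
  "po_R83",      "No_call_bak",
  "?OP",         "card_nofunds",
  "HOTELARCH78", "overbookings"
)

# Raw concatenated error strings (the "ERROR_COLAB" field, named ERROR here)
errors <- tribble(
  ~ERROR,
  "?OP_ERR7+JSU8.OIJK1",
  "po_R83_io",
  "IOS_NEVER:300SSSS",
  "HOTELARCH78?123-"
)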

R_Student
    Refer to https://stackoverflow.com/questions/26405895/how-can-i-match-fuzzy-match-strings-from-two-datasets – Peace Wang May 03 '21 at 18:54

1 Answer

A base R option using agrep (approximate string matching) + merge:

merge(
  transform(
    codes,
    ERROR = sapply(code_error, function(x) agrep(x, errors$ERROR, value = TRUE))
  ),
  errors,
  all = TRUE
)

gives

                ERROR  code_error      meaning
1 ?OP_ERR7+JSU8.OIJK1         ?OP card_nofunds
2    HOTELARCH78?123- HOTELARCH78 overbookings
3   IOS_NEVER:300SSSS        <NA>         <NA>
4           po_R83_io      po_R83  No_call_bak
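
Note that agrep matches approximately and anywhere in the string, and the sapply call above expects each code to match exactly one entry. If a strict "starts with" comparison is what you need, the following is a minimal sketch using base R's startsWith. It assumes that when several codes are prefixes of the same string, taking the first one in the table is acceptable; the names idx and result are only illustrative.

```r
# Data from the question, built with base R so no packages are needed
codes <- data.frame(
  code_error = c("po_R83", "?OP", "HOTELARCH78"),
  meaning    = c("No_call_bak", "card_nofunds", "overbookings"),
  stringsAsFactors = FALSE
)
errors <- data.frame(
  ERROR = c("?OP_ERR7+JSU8.OIJK1", "po_R83_io",
            "IOS_NEVER:300SSSS", "HOTELARCH78?123-"),
  stringsAsFactors = FALSE
)

# For each error string, find the row of the first code it starts with
# (NA when no code is a prefix). startsWith() compares literally, so the
# "?" in "?OP" is not treated as a regex metacharacter.
idx <- vapply(
  errors$ERROR,
  function(e) which(startsWith(e, codes$code_error))[1],
  integer(1)
)

# Look up the matched code and meaning; unmatched rows become NA
result <- data.frame(
  ERROR      = errors$ERROR,
  code_error = codes$code_error[idx],
  meaning    = codes$meaning[idx],
  stringsAsFactors = FALSE
)
print(result)
```

Unlike the merge version, this keeps the rows in the original order of errors and leaves NA for strings such as "IOS_NEVER:300SSSS" that start with no known code.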
ThomasIsCoding