In a data frame, one column contains GS1 codes scanned from barcodes. A GS1 code is a string that combines several types of information. Application Identifiers (AIs) indicate what type of information the next part of the string holds. Here is an example of a GS1 string: (01)8714729797579(17)210601(10)23919374. Each AI is written between brackets. In this case (01) means 'GTIN', (17) means 'Expiration Date' and (10) means 'LOT'. What I would like to do in R is split this single column into three columns, using the AI meanings as the new column names.
I tried using 'separate' from tidyr with each AI as the separator, but the brackets aren't removed. Why aren't the brackets removed?
library(dplyr)  # for %>%
library(tidyr)  # for separate()
df <- data.frame(id = c(1, 2, 3), CODECONTENT = c("(01)871(17)21(10)2391", "(01)579(17)26(10)9374", "(01)979(17)20(10)9193"))
df <- df %>%
  separate(CODECONTENT, c("GTIN", "Expiration_Date"), "(17)", extra = "merge") %>%
  separate(Expiration_Date, c("Expiration Date", "LOT"), "(10)", extra = "merge")
The above returns the following:
| | id | GTIN | Expiration Date | LOT |
|---|---|---|---|---|
| 1 | 1 | (01)871( | )21( | )2391 |
| 2 | 2 | (01)579( | )26( | )9374 |
| 3 | 3 | (01)979( | )20( | )9193 |
I am not sure why the brackets are still there. Besides removing the brackets, is there a smarter way to also remove the first AI, (01), in the same code?
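
For reference, the behaviour described above can be reproduced in base R. This is a minimal sketch, not a definitive solution: it relies on the separator being read as a regular expression, in which "(17)" is a capture group that matches only the characters "17". The variable names `regex_split`, `fixed_split` and `pattern` are illustrative, and the single-pattern approach assumes the AIs always appear in the order (01), (17), (10).

```r
# Sample data from the question
df <- data.frame(
  id = c(1, 2, 3),
  CODECONTENT = c("(01)871(17)21(10)2391",
                  "(01)579(17)26(10)9374",
                  "(01)979(17)20(10)9193"),
  stringsAsFactors = FALSE
)

# As a regex, "(17)" is a capture group matching the characters "17",
# so the split consumes only the digits and leaves the brackets behind.
regex_split <- strsplit(df$CODECONTENT[1], "(17)")[[1]]

# With fixed = TRUE the pattern is taken literally, brackets included.
fixed_split <- strsplit(df$CODECONTENT[1], "(17)", fixed = TRUE)[[1]]

# One anchored pattern with escaped brackets captures all three fields
# and drops the leading (01) at the same time.
# Assumption: the AIs always appear in the order (01), (17), (10).
pattern <- "^\\(01\\)(.*?)\\(17\\)(.*?)\\(10\\)(.*)$"
df$GTIN            <- sub(pattern, "\\1", df$CODECONTENT, perl = TRUE)
df$Expiration_Date <- sub(pattern, "\\2", df$CODECONTENT, perl = TRUE)
df$LOT             <- sub(pattern, "\\3", df$CODECONTENT, perl = TRUE)
df$CODECONTENT <- NULL
```

In the original tidyr code, the same idea would presumably amount to escaping the brackets in the separator, for example "\\(17\\)" and "\\(10\\)" instead of "(17)" and "(10)".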