I'm currently cleaning some country-based data. I have approximately 1000 entries and need to replace all country codes with full country names. Some example codes are below:

"SL/L/N", "Sierra Leone", "L", "Lib/Nepal", "SL2/ Nepal", "SL2/L"

My code converts all of the codes/countries correctly except one. The issue is that "L" stands for "Liberia" and so needs substituting, but I can't differentiate between an "L" that sits within a word, e.g. "Sri Lanka", and one that stands for "Liberia". I tried using forward slashes as identifying features in the code below, but it returns NA for the "L" entries:

lut = c("Lib" = "Liberia", "Sri lanka" = "Sri Lanka", "WACC" = "West Africa", "W.Africa" = "West Africa", "SL2" = "Sri Lanka", "N" = "Nepal", "SL" = "Sierra Leone", "/L" = "/Liberia", "/L/" = "/Liberia/", "/L" = "/Liberia")
countryData$Country <- lut[countryData$Country]

Any help in turning the correct "L"s into "Liberia" while leaving "Sri Lanka" and "Sierra Leone" untouched would be gratefully received.

pogibas
  • Use the regex `\\bL\\b` as in `sub("\\bL\\b", "Liberia", "SL/L/N")`. – Rui Barradas Sep 26 '17 at 18:28
  • Can also use the anchors `^` and `$` to match to start and end of string respectively in your regex: `gsub("^L$", "Liberia" , your_strings)` – dshkol Sep 26 '17 at 18:34
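
The word-boundary idea from the comments could be combined with the lookup table roughly as follows. This is only a sketch under some assumptions: the sample vector `codes` and the helper `expand_codes` are made up for illustration, the table is trimmed to codes without regex special characters (an entry like "W.Africa" would need escaping first), and an explicit `"L"` entry replaces the slash-based ones.

```r
# Hypothetical sample data mirroring the codes in the question
codes <- c("SL/L/N", "Sierra Leone", "L", "Lib/Nepal",
           "SL2/ Nepal", "SL2/L", "Sri Lanka")

# Lookup table: code -> full country name (trimmed for the example)
lut <- c(Lib = "Liberia", L = "Liberia", SL2 = "Sri Lanka",
         SL = "Sierra Leone", N = "Nepal")

# Hypothetical helper: replace each code only where it appears as a whole
# "word", i.e. surrounded by word boundaries (\\b), so the "L" inside
# "SL", "Sri Lanka" or "Sierra Leone" is left alone.
expand_codes <- function(x, lut) {
  for (code in names(lut)) {
    pattern <- paste0("\\b", code, "\\b")
    x <- gsub(pattern, lut[[code]], x, perl = TRUE)
  }
  x
}

result <- expand_codes(codes, lut)
print(result)
```

With this approach a standalone "L" (including the one in "SL/L/N") becomes "Liberia", while "Sri Lanka" and "Sierra Leone" pass through unchanged. The `^L$` variant from the second comment would only catch entries that consist of "L" and nothing else, so it would miss the "L" in combined codes such as "SL2/L".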
