
I want to scan a document for all instances of a set of known numbers and replace each occurrence with a string. Five different numbers appear in the data, and each one should be converted into a corresponding alphanumeric string.

  • Every time a number appears in the data it is preceded by "I/N:" (e.g. I/N:AB1243)
  • The numbers are 6-character alphanumeric codes (e.g. AB1243)
  • There are 5 different numbers represented (for example: A86501, B86502, C86503, A89777 and B89778)
  • I want to replace each number with a predetermined string (e.g. replace all instances of A86501 with Str1, all instances of B86502 with Str2, C86503 with Str3, and so on)

An example input and resultant desired output is as follows:

    string Str1 = "YELLOW07";
    string Str2 = "BLUE82";
    string Str3 = "RED31";

Input data: NM:BLUEMEDIA000001LOC:NewYorkJFKI/N:A86501DT:07082021NM:JUNESWEEPSTAKESLOC:FargoI/N:B86502DT:10/08/2021

Desired output data: NM:BLUEMEDIA000001LOC:NewYorkJFKI/N:YELLOW07DT:07082021NM:JUNESWEEPSTAKESLOC:FargoI/N:BLUE82DT:10/08/2021

1 Answer


One option is to use a regular expression replace with a MatchEvaluator. A MatchEvaluator is a delegate that, in the context of Regex.Replace, decides what replacement to apply for each match.

Putting find and replace in a dictionary:

    var d = new Dictionary<string, string> {
      { "ABC", "ZYX" },
      { "FOO", "BAR" },
      { "A86501", Str1 },
      ...
    };

Making a regex from the keys:

    var r = string.Join("|", d.Keys); // regex such as ABC|FOO|A86501

Having a regex replacement that asks the dictionary for what to replace:

    var doc2 = Regex.Replace(doc, r, m => d[m.Value]);
  

The regex searches for ABC or FOO. When one is found, it is loaded into a Match and the delegate is called; m is the Match, so we use its Value (e.g. FOO) to look up the replacement in the dictionary (e.g. BAR). That is the return value of the delegate, and it replaces the match in the resulting string.
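Putting the pieces together for the data in the question, a minimal sketch might look like the following. The class and method names (ItemNumberReplacer, ReplaceAll) are made up for illustration, and the mapping only covers the three codes the question gives replacements for. Two additions go beyond the snippets above and are assumptions rather than requirements: the keys are passed through Regex.Escape in case a key ever contains a regex metacharacter, and a lookbehind restricts matches to codes that directly follow "I/N:".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class ItemNumberReplacer
{
    // Illustrative mapping, taken from the examples in the question.
    private static readonly Dictionary<string, string> Replacements =
        new Dictionary<string, string>
        {
            { "A86501", "YELLOW07" },
            { "B86502", "BLUE82" },
            { "C86503", "RED31" },
        };

    // (?<=I/N:) is a lookbehind: a key only matches directly after "I/N:".
    // Regex.Escape keeps any metacharacters in a key from being read as regex syntax.
    private static readonly Regex Pattern = new Regex(
        "(?<=I/N:)(?:" +
        string.Join("|", Replacements.Keys.Select(k => Regex.Escape(k))) +
        ")");

    // For every match, the delegate asks the dictionary for the replacement.
    public static string ReplaceAll(string doc)
    {
        return Pattern.Replace(doc, m => Replacements[m.Value]);
    }
}
```

Calling ItemNumberReplacer.ReplaceAll on the input data from the question should give the desired output shown there, with I/N:A86501 becoming I/N:YELLOW07 and I/N:B86502 becoming I/N:BLUE82.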

Because the regex is built from the dictionary's keys, we should never look up anything that isn't in the dictionary (and thus get a KeyNotFoundException). Beware, though: if you put the regex in case-insensitive mode, you might get one if your dictionary has FOO and the regex finds foo. To deal with that, you could add all your keys as uppercase and call ToUpper on m.Value.
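As a variation on the uppercase-keys idea, the dictionary itself can be made case-insensitive by giving it a comparer, so that no ToUpper call is needed. This is a sketch of that alternative, not the approach described above; the class name CaseInsensitiveReplacer is hypothetical, and it assumes plain ASCII keys so that the regex's case folding and the comparer's agree.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class CaseInsensitiveReplacer
{
    // The comparer makes lookups ignore case, so "foo" finds the "FOO" entry.
    private static readonly Dictionary<string, string> Map =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "FOO", "BAR" },
            { "ABC", "ZYX" },
        };

    // RegexOptions.IgnoreCase lets the pattern match foo, Foo, FOO, and so on.
    private static readonly Regex Pattern = new Regex(
        string.Join("|", Map.Keys.Select(k => Regex.Escape(k))),
        RegexOptions.IgnoreCase);

    public static string ReplaceAll(string doc)
    {
        return Pattern.Replace(doc, m => Map[m.Value]);
    }
}
```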

Note that the dictionary doesn't have to be built in code; it can come from a config file, a database, etc.

Caius Jard