-2

I have some strings like "Dave&#39 s Market" or "C&#39 est la vie" that I would like to convert to "Daves Market" and "Cest la vie" respectively. I know the pattern is something like '[&#39]+', but I cannot get the optional " s" to become just "s".

bill rowe
  • 1
    Possible duplicate of [Remove html entities and extract text content using regex](https://stackoverflow.com/questions/26127775/remove-html-entities-and-extract-text-content-using-regex) – Vasan May 16 '18 at 17:10
  • 1
    Try `[0-9]+\s+` – JohnyL May 16 '18 at 17:11
  • @Vasan May not be, since in this case the OP may want to convert the text instead of remove; i.e. actually want to have "Dave's Market" instead of "Daves Market" – cst1992 May 18 '18 at 09:10

2 Answers

1

The regex substitution `s/&#39 //g` should work; see this demo.
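In JavaScript, the same substitution might look like the sketch below. `stripApostropheEntity` is a hypothetical helper name, and the pattern assumes the entity always appears as `&#39` followed by exactly one space, as in the question's examples.

```javascript
// Hypothetical helper: removes every "&#39 " (the entity plus the single space after it).
function stripApostropheEntity(text) {
  // The "g" flag makes replace() act on all occurrences, not just the first.
  return text.replace(/&#39 /g, "");
}

console.log(stripApostropheEntity("Dave&#39 s Market")); // Daves Market
console.log(stripApostropheEntity("C&#39 est la vie"));  // Cest la vie
```

If the space after the entity is sometimes missing, `/&#39 ?/g` would cover both forms, though that goes beyond what the question shows.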

L3viathan
  • Maybe replace the hard-coded `39` to a generic `[\d]+`? – CinCout May 18 '18 at 09:16
  • I don't see that requirement in the question. I personally would think one would want to decode them instead of removing them, but if the OP wants `&#39` removed, I'll remove `&#39` and nothing else. – L3viathan May 18 '18 at 10:00
0

What you'd rather want is something like replacing all those escaped literals with what they should actually be, e.g.:

Dave&#39;s -> Dave's
Me &amp; Her -> Me & Her

For that you'll need a table of replacements and a regex to apply each one.

An example (in JavaScript):

var m = new Map();
m.set("&#39;", "'");
m.set("&amp;", "&");
// and so on

m.forEach(function(value, key) {
    // text contains your text; the "g" flag replaces every occurrence, not just the first
    text = text.replace(new RegExp(key, "g"), value);
});
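As a variation on the loop above, here is a minimal sketch that decodes in a single pass, so text that has already been decoded is never scanned again (this matters for input such as `&amp;#39;`). The `ENTITIES` table and the `decodeEntities` name are assumptions made for illustration; the table only covers the two entities mentioned here.

```javascript
// Assumed lookup table; extend it with whatever entities your data contains.
const ENTITIES = new Map([
  ["&#39;", "'"],
  ["&amp;", "&"],
]);

// Hypothetical helper: finds anything shaped like an entity and swaps in
// the table value, leaving unknown entities untouched.
function decodeEntities(text) {
  return text.replace(/&(?:#\d+|[a-z]+);/gi, (match) =>
    ENTITIES.has(match) ? ENTITIES.get(match) : match
  );
}

console.log(decodeEntities("Dave&#39;s"));   // Dave's
console.log(decodeEntities("Me &amp; Her")); // Me & Her
```

For real HTML, a proper decoder (for example the browser's own parser) is likely a safer choice than a hand-maintained table.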
cst1992