Please consider the following examples.

The next line is **

_This line surrounded with emphasis mark_

`hey this is crazy`

**bold**

Now, I want to figure out a regex that identifies the special characters. Basically, I want to use string.replaceAll(regex, "") so that I can remove only the special characters **, _ and ` from the string. Consider each line to be one string.

I can see that each opening special character is preceded by either a space or a newline, followed by a string, followed by the closing special character I am trying to remove.

Also please explain the regex.

Wiktor Stribiżew
Nayanjyoti Deka
  • Please show the code you have tried so far. – Wiktor Stribiżew May 22 '16 at 14:29
  • @WiktorStribiżew Currently, the regex I have in mind looks like this: `^(\\s|\\n)(\\**|_|`) ` – Nayanjyoti Deka May 22 '16 at 14:30
  • Possible duplicate of [Learning Regular Expressions](http://stackoverflow.com/questions/4736/learning-regular-expressions) – Biffen May 22 '16 at 14:32
  • Well, sorry, it is either rather broad or unclear. Removing symbols is as easy as using `.replaceAll("[_*\`]+","")`. BTW, `\s` matches `\n`. – Wiktor Stribiżew May 22 '16 at 14:36
  • @WiktorStribiżew I want to replace it only when it is preceded by a space or a newline, not when it is part of a word, e.g. `unicode_snob`. With the above regex, the underscore here would be replaced as well. – Nayanjyoti Deka May 22 '16 at 14:40
  • Like `.replaceAll("(?<!\\S)[_*\`]++|[_*\`]++(?!\\S)","")`? See, without exact requirements it is difficult to help. – Wiktor Stribiżew May 22 '16 at 14:41
  • @WiktorStribiżew Slightly closer. The regex also has to replace asterisks only when there are two of them, not one; the string might contain a single asterisk, which is a valid case. Also, I think this won't remove the trailing special characters. Or maybe I need another regex pass to remove the trailing ones? – Nayanjyoti Deka May 22 '16 at 14:45
  • Ok, I can only suggest `"(?<!\\S)(\\*{2}|[_\`])|(\\*{2}|[_\`])(?!\\S)"`. – Wiktor Stribiżew May 22 '16 at 14:47
  • @WiktorStribiżew Bingo!!! Thanks a lot!! Could you please explain how you figured it out? `(?<!\\S)` looks for a space at the beginning. `(\\*{2}|[_`])` is where you require the asterisk to appear twice, plus the underscore and backtick characters. I am unable to understand why `|(\\*{2}|[_`])` comes after that. The last `(?!\\S)` is again for a space, I believe. Could you please explain how it works? That would be really helpful. – Nayanjyoti Deka May 22 '16 at 14:52

1 Answer

You may use

"(?<!\\S)(\\*{2}|[_`])|(\\*{2}|[_`])(?!\\S)"

See the regex demo

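The regex matches any **, _ or ` (the (\\*{2}|[_`]) groups) that is either not preceded by a non-whitespace character (the (?<!\\S) lookbehind, i.e. the marker sits at the start of the string or right after whitespace), or not followed by a non-whitespace character (the (?!\\S) lookahead, i.e. the marker sits at the end of the string or right before whitespace). The two alternatives are joined with |, so leading and trailing markers are both removed in a single pass, while a lone * or an underscore inside a word such as unicode_snob is left untouched.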
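As a minimal sketch of how this could be wired into Java, the snippet below applies the pattern to the example lines from the question. The class name MarkerStripper and the helper strip are hypothetical names chosen for illustration; only the regex itself comes from the answer above.

```java
import java.util.regex.Pattern;

public class MarkerStripper {
    // Regex from the answer: "**", "_" or "`" that touches whitespace
    // or a string boundary on at least one side.
    private static final Pattern MARKERS =
            Pattern.compile("(?<!\\S)(\\*{2}|[_`])|(\\*{2}|[_`])(?!\\S)");

    // Hypothetical helper: removes every match of MARKERS from one line.
    public static String strip(String line) {
        return MARKERS.matcher(line).replaceAll("");
    }

    public static void main(String[] args) {
        String[] lines = {
            "_This line surrounded with emphasis mark_",
            "`hey this is crazy`",
            "**bold**",
            "unicode_snob and 2 * 3"   // should stay unchanged
        };
        for (String line : lines) {
            System.out.println(strip(line));
        }
    }
}
```

Precompiling the Pattern is equivalent to calling line.replaceAll(regex, "") directly, but avoids recompiling the regex for every line.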

Wiktor Stribiżew