0

I am parsing a string into XML with PowerShell:

NODE IP 0.0.0.0 "APXPRD"

And I need to get:

<NODE_IP>0.0.0.0 "APXPRD"</NODE_IP>

I tried to use a regex, but I can't figure out how to replace ' ' with '_' only when it sits between words written in all capital letters. Any advice?

I tried a regex like this:

$textis = 'NODE IP 0.0.0.0  "APXPRD"'
$textnew = $textis.replace('/^\s*[A-Z]+(?:\s+[A-Z]+)/m', '_')

but that doesn't seem to work :/

Gorodeckij Dimitrij
  • 2
    Please provide more context. There's probably a better solution to the overall problem than fixing a regular expression for this particular case. – Ansgar Wiechers Jan 12 '17 at 17:51

2 Answers

1

You can search for:

((?:^|\s)[A-Z]+)\s([A-Z]+(?:\s|$))

And replace it with:

$1_$2

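This looks for a run of uppercase letters ([A-Z]+) that directly follows either whitespace (\s) or the beginning of the string (^). A single whitespace character (\s) must come next, followed by a second run of uppercase letters that is itself followed by whitespace (\s) or the end of the string ($). That trailing condition keeps the pattern from matching when only the first letter of the second word is uppercase, as in "SEVERITY Warning".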


If you are using PowerShell, do the replacement like this. Note that the case-sensitive operator is -creplace, not just -replace:

$textis = 'NODE IP 0.0.0.0 TEST String "APXPRD"'
$textnew = $textis -creplace '((?:^|\s)[A-Z]+)\s([A-Z]+(?:\s|$))','$1_$2'
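
To make the difference between the string method and the operator concrete, here is a minimal sketch using the sample line from the question. It relies only on built-in PowerShell; the variable names `$pattern`, `$literal` and `$regex` are illustrative and not part of the original code.

```powershell
$textis  = 'NODE IP 0.0.0.0 "APXPRD"'
$pattern = '((?:^|\s)[A-Z]+)\s([A-Z]+(?:\s|$))'

# String.Replace() searches for the literal text of $pattern. That text does
# not occur in the input, so the string comes back unchanged.
$literal = $textis.Replace($pattern, '$1_$2')

# -creplace interprets $pattern as a case-sensitive regular expression.
$regex = $textis -creplace $pattern, '$1_$2'

$literal   # NODE IP 0.0.0.0 "APXPRD"
$regex     # NODE_IP 0.0.0.0 "APXPRD"
```

Plain -replace is case-insensitive, so with it [A-Z] would also match lowercase letters; that is why the c prefix matters here.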
ssc-hrep3
  • What do you mean? – ssc-hrep3 Jan 12 '17 at 18:01
  • $textnew = $textis.replace('[A-Z]+)\s([A-Z]+', '$1_$2') does not work :/ – Gorodeckij Dimitrij Jan 12 '17 at 18:03
  • You are missing the brackets... `$textnew = $textis.replace('([A-Z]+)\s([A-Z]+)', '$1_$2')` And maybe you need to escape the \ in `\s` too. So it would be `$textnew = $textis.replace('([A-Z]+)\\s([A-Z]+)', '$1_$2')` – ssc-hrep3 Jan 12 '17 at 18:05
  • Thanks, you are right about the brackets, but it still does not work for me and I can't understand why. PS C:\Windows\system32> $textis = 'NODE IP 0.0.0.0 "APXPRD"' PS C:\Windows\system32> $textnew = $textis.replace('(?<=[A-Z]) (?=[A-Z])', '$1_$2') PS C:\Windows\system32> $textnew NODE IP 0.0.0.0 "APXPRD" PS C:\Windows\system32> $textnew = $textis.replace('([A-Z]+)\\s([A-Z]+)', '$1_$2') PS C:\Windows\system32> $textnew NODE IP 0.0.0.0 "APXPRD" – Gorodeckij Dimitrij Jan 13 '17 at 11:03
  • @GorodeckijDimitrij If you are using PowerShell, you can use the `-replace` command. See the updated answer. – ssc-hrep3 Jan 13 '17 at 12:47
  • Could you please help me a little with this: a line like "SEVERITY Warning" also becomes "SEVERITY_Warning", but I want to avoid that. It should replace the space only if all letters of the second word are capital. Is that possible? – Gorodeckij Dimitrij Jan 18 '17 at 11:30
  • You are right, that will match too, I'll update my answer. – ssc-hrep3 Jan 18 '17 at 11:47
  • Thanks a lot! I just hit another case and hope you can help there as well: is it possible to replace the spaces in this one: CONDITION TEXT SET? For now it only produces CONDITION_TEXT SET. Thanks for your help, that regex magic drives me crazy :) – Gorodeckij Dimitrij Jan 18 '17 at 15:27
  • It is quite easy to write an expression for three words, but rather difficult if you're trying to match `n` words (see [here](http://stackoverflow.com/a/6939587/3233827)). So you would need a different regular expression for every number of words you want to match. If you only want to match 3 words, you would replace with `$1_$2_$3`, which no longer works for two words (it would lead to `CONDITION__TEXT`). So you would need to run several replacements, one pattern per word count, in decreasing number of words. – ssc-hrep3 Jan 18 '17 at 17:33
  • Ah, I had not understood the $1_$2 stuff properly (beginner in PowerShell), now I get it! Thanks a lot again! – Gorodeckij Dimitrij Jan 19 '17 at 10:50
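
The multi-pass idea from the comments above (one pattern per word count, longest first) could look roughly like the sketch below. The function name `Join-UpperWords` and the three-word pattern are assumptions made for illustration; only the two-word pattern comes from the answer itself.

```powershell
# Hypothetical helper: joins runs of two or three all-uppercase words with '_'.
function Join-UpperWords {
    param([string]$Text)

    # Assumed extension of the answer's pattern to three words.
    $threeWords = '((?:^|\s)[A-Z]+)\s([A-Z]+)\s([A-Z]+(?:\s|$))'
    # The two-word pattern from the answer.
    $twoWords   = '((?:^|\s)[A-Z]+)\s([A-Z]+(?:\s|$))'

    # Longest pattern first, so a three-word run is not half-joined by the
    # two-word pass.
    $Text = $Text -creplace $threeWords, '$1_$2_$3'
    $Text = $Text -creplace $twoWords, '$1_$2'
    $Text
}

Join-UpperWords 'CONDITION TEXT SET'          # CONDITION_TEXT_SET
Join-UpperWords 'NODE IP 0.0.0.0 "APXPRD"'    # NODE_IP 0.0.0.0 "APXPRD"
Join-UpperWords 'SEVERITY Warning'            # SEVERITY Warning
```

Each additional word count needs its own pattern and pass, which is the limitation the comment describes.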
1
(?<=[A-Z]) (?=[A-Z])

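This matches the spaces between capital letters.

Note the literal space in the middle. The pattern uses a lookbehind and a lookahead, so only the space itself is matched and the surrounding letters are left untouched.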
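
As a rough sketch, this is how the pattern behaves with PowerShell's -creplace on the sample strings from this thread. The stricter `$strict` variant is an assumption added for illustration: it requires whole all-uppercase words on both sides and relies on .NET regex supporting variable-length lookbehind.

```powershell
$lines = 'NODE IP 0.0.0.0 "APXPRD"', 'CONDITION TEXT SET', 'SEVERITY Warning'

# The pattern from this answer: a space with a capital letter on both sides.
$simple = '(?<=[A-Z]) (?=[A-Z])'

# Assumed stricter variant: a complete all-uppercase word on each side.
$strict = '(?<=\b[A-Z]+) (?=[A-Z]+\b)'

# Lookarounds consume nothing, so the replacement is just the underscore.
$simpleResult = $lines | ForEach-Object { $_ -creplace $simple, '_' }
$strictResult = $lines | ForEach-Object { $_ -creplace $strict, '_' }

$simpleResult   # NODE_IP 0.0.0.0 "APXPRD", CONDITION_TEXT_SET, SEVERITY_Warning
$strictResult   # NODE_IP 0.0.0.0 "APXPRD", CONDITION_TEXT_SET, SEVERITY Warning
```

Because every qualifying space is matched on its own, this handles any number of consecutive uppercase words in a single pass.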

Andrew Magerman