1

I am completely clueless about how to use regex and need some help with this problem. I need to replace the `<` and `>` delimiters with new lines but keep the string between them. So

<'sample text'><'sample text 2'>

becomes

'sample text'
'sample text2'
mklement0
Adlis
  • What language are you using? Also, do you need a new line for the first `<` and the last `>`? – Niitaku Feb 02 '17 at 19:04
  • Just replace `><` with a newline, no regular expression needed. – Barmar Feb 02 '17 at 19:04
  • The general answer about how to keep parts of a string when doing a regular expression replacement of other parts is to use a capture group for the parts you want to keep, and back-references in the replacement. – Barmar Feb 02 '17 at 19:05
  • How should the result look for this input: `<'sample text'> <'sample text 2'> some text <'sample text 3'> <>`? – RomanPerekhrest Feb 02 '17 at 19:12
  • @Niitaku: I'm using PowerShell, and the first and last group of <> shouldn't have new lines. @Barmar: I will try that and see if it works with the text file I am using. @RomanPerekhrest: it should look like 'sample text' *new line*'sample text 2' some text *new line*'sample text 3' *new line*"empty line" – Adlis Feb 02 '17 at 19:19
  • @Adlis, there's a contradiction between your first expected result `'sample text' \n 'sample text2'` and the last one `sample text' 'sample text 2' some text 'sample text 3' "empty line"` – RomanPerekhrest Feb 02 '17 at 19:24
  • Whoops, I missed that. It should be sample text \n sample text 2 \n some text sample text 3 \n empty \n. – Adlis Feb 02 '17 at 19:37
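
Barmar's first suggestion in the comments (replace `><` with a newline, no regex needed) can be sketched as follows. This is an illustration in Python rather than PowerShell, and it assumes the input always starts with `<` and ends with `>`; the variable names are made up for the example.

```python
# Sample input from the question.
text = "<'sample text'><'sample text 2'>"

# Assumption: the string is wrapped in exactly one leading "<" and one trailing ">".
inner = text.removeprefix("<").removesuffix(">")  # requires Python 3.9+

# Every remaining "><" marks a boundary between two items.
result = inner.replace("><", "\n")

print(result)
# 'sample text'
# 'sample text 2'
```

This only works when the items are directly adjacent; input with whitespace or other text between `>` and `<` needs the regex approach from the answers.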

3 Answers

2
\<([^>]*)\>

This regex captures the text between < and > into a capture group, which you can then reference in the replacement string, followed by a newline:

\1\n

Check it out here.
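
As a rough, language-neutral illustration of the same pattern and back-reference, here is a minimal sketch using Python's `re` module (an assumption for demonstration only; the angle brackets are left unescaped because they have no special meaning in Python's regex syntax):

```python
import re

text = "<'sample text'><'sample text 2'>"

# ([^>]*) captures everything up to the next ">";
# \1 in the replacement re-inserts the captured text, \n adds a newline.
result = re.sub(r"<([^>]*)>", r"\1\n", text)

print(result, end="")
# 'sample text'
# 'sample text 2'
```

Note that every match is followed by a newline, so the result ends with a trailing newline; call `result.rstrip("\n")` if that is unwanted.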

EDIT:

In PowerShell

PS C:\Users\shtabriz> $string = "<'sample text'><'sample text 2'>"
PS C:\Users\shtabriz> $regex = "\<([^>]*)\>"
PS C:\Users\shtabriz> [regex]::Replace($string, $regex, '$1'+"`n")
'sample text'
'sample text 2'
Shawn Tabrizi
0

This works for me in Textpad:

Example:

String:

" 1) Navigate to record. 2) Navigate to the tab and select. 3) Click the field. 4) Click on the tab and scroll."

Note: For the search/replace below, do NOT include the quotes; I used them to show the presence of a space in the search term.

Search: "[0-9]+) " Replace: "\n$0"

Resulting String:

  1. Navigate to record.
  2. Navigate to the tab and select.
  3. Click the field.
  4. Click on the tab and scroll.

(Note: Stack Overflow's list formatting changed my ")" to a ".")
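
The same idea (insert a newline before each match by referring to the whole match in the replacement) can be sketched outside Textpad. The following is a Python approximation, not Textpad syntax: Python writes the whole-match reference as `\g<0>` instead of `$0`, and the `)` has to be escaped in the pattern.

```python
import re

text = (" 1) Navigate to record. 2) Navigate to the tab and select. "
        "3) Click the field. 4) Click on the tab and scroll.")

# Match a step number such as "1) " and re-insert it after a newline.
# \g<0> stands for the entire matched text.
result = re.sub(r"[0-9]+\) ", r"\n\g<0>", text)

print(result)
```

Each numbered step now starts on its own line; the leading space of the input stays on the first line by itself.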

0

To complement Shawn Tabrizi's helpful answer with a more PowerShell-idiomatic solution and some background information:

PowerShell surfaces the functionality of the .NET System.Text.RegularExpressions.Regex.Replace() method ([regex]::Replace(), from PowerShell) via its own -replace operator.

The most concise solution (but see below for potential pitfalls):

# Note the escaped "$" ("`$")
"<'sample text'><'sample text 2'>" -replace '<(.*?)>', "`$1`n"

Output:

'sample text'
'sample text 2'
  • $1 is a numbered capture-group substitution, referring to what the 1st (and only) capture group inside the regex ((...)) captured, which are the strings between < and > (.*? is a non-greedy expression that matches any run of characters but stops once the next construct, > in this case, is found).

    • However, inside a double-quoted string ("..."), also known as an expandable string, $1 would be interpreted as a PowerShell variable reference, so the $ character must be escaped in order to be preserved, using the backtick (`), PowerShell's general escape character: "`$1"

    • Conversely, if you want the .NET API not to interpret a $ character in the substitution string, use $$ (either $$ inside '...', or "`$`$" inside "...") - but note that inside the regex operand a verbatim $ must be escaped as \$.

  • "`n" is a PowerShell escape sequence that can be used inside expandable strings (only) - see the conceptual about_Special_Characters help topic.
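
To see why the non-greedy `.*?` matters, the following sketch compares it with the greedy `.*` and with the negated character class `[^>]*` used in the earlier answer. It uses Python's `re` module purely for illustration; the quantifier semantics shown here are the same ones described above for .NET.

```python
import re

text = "<'sample text'><'sample text 2'>"

# Greedy: .* runs to the LAST ">", so the whole input is a single match.
greedy = re.findall(r"<(.*)>", text)

# Non-greedy: .*? stops at the FIRST ">" that lets the match succeed.
lazy = re.findall(r"<(.*?)>", text)

# Negated class: [^>]* cannot cross a ">" at all, with the same result here.
negated = re.findall(r"<([^>]*)>", text)

print(greedy)   # ["'sample text'><'sample text 2'"]
print(lazy)     # ["'sample text'", "'sample text 2'"]
print(negated)  # ["'sample text'", "'sample text 2'"]
```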

Caveat:

  • While convenient here, there are pitfalls with respect to using expandable strings as the regexes and substitution operands, as it isn't always obvious what PowerShell expands (interpolates) up front, and what the .NET API ends up seeing as a result.

  • Therefore, it is generally preferable to use single-quoted strings ('...', also known as verbatim strings) - both for the substitution operand and the regex itself, and - if needed - use an expression ((...)) to build the overall string, which allows you to separate the verbatim (pass-through) parts from interpolated parts.

This is what Shawn did in his answer; translated to a -replace operation:

# Note the expression used to build the substitution string
# from a verbatim ('...') and an interpolated ("...") part.
"<'sample text'><'sample text 2'>" -replace '<(.*?)>', ('${1}' + "`n")

Another option, using -f, the format operator:

"<'sample text'><'sample text 2'>" -replace '<(.*?)>', ("{0}`n" -f '${1}')

Note the use of ${1} instead of just $1: Enclosing the number / name of the referenced capture group in {...} disambiguates it from the characters that follow, which avoids another pitfall, as the following example shows (incidentally, PowerShell's own variable references can be disambiguated the same way):

# FAILS and results in 'f$142', because the .NET API sees
# '$142' as the substitution string, and there is no 142nd capture group.
$suffix = '42'; 'foo' -replace '(oo)', ('$1' + $suffix)

# OK, with disambiguation via {...} -> 'foo42'
$suffix = '42'; 'foo' -replace '(oo)', ('${1}' + $suffix)
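
The same ambiguity exists in other regex engines. As a hedged comparison, Python's `re` module offers `\g<1>` as its counterpart to `${1}`; the sketch below mirrors the example above, with variable names chosen only for illustration.

```python
import re

suffix = "42"

# Ambiguous: r"\1" + "42" yields the template r"\142", which Python does
# not interpret as "group 1 followed by the text 42".
ambiguous_template = r"\1" + suffix

# Unambiguous: \g<1> delimits the group number explicitly.
safe_template = r"\g<1>" + suffix

result = re.sub(r"(oo)", safe_template, "foo")
print(result)  # foo42
```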
mklement0