Using Notepad++, I would like to replace every occurrence of a semicolon followed by a space ("; ") with a bare semicolon (";") in a string like this:

S#14_Budget.Y#2014; 2015; 2016.P{[Third Generation]}.w#Periodic.E{PRU.[Descendants]}; CNS.PRU.V#[None]; Total.A{Asset.[Base]}; {Liab.[Base]}; {NProfit.[Base]}; {Bal.[Base]}; {Stats.[Base]}; {BalSht.[Base]}; {NIncome.[Base]}.I{[ICP Top].[Base]}.C1#TotalC1.C2#TotalC2.C3#[None].C4#[None]</ns

The string can occur hundreds of times in one file, and I find it using S#.*</ns.
The string always begins with S# and always ends with </ns.
All the bits in the middle can vary, including the number of spaces.
Only the "; " needs to be changed to ";".

Björn Lindqvist
jhw
  • ["Notepad++ seems to not have implemented variable-length look-behinds"](http://stackoverflow.com/questions/17286667/error-in-regular-expression-in-notepad-using-negative-lookbehind) (at least not when that was written, according to random-stranger-on-the-internet) - this might make this rather difficult / impossible (otherwise replacing `(?<=S#14.*?); (?=.*?</ns)` with `;` would work). – Bernhard Barker Sep 18 '13 at 22:17
  • @Dukeling We could use `S#14.*?\K; (?=.*?</ns)` – HamZa Sep 18 '13 at 22:27
  • @HamZa, `S#14.*?\K; (?=.*? – jhw Sep 18 '13 at 22:52
  • @HamZa Since there is a `Replace All`, you can just hit that as many times as there are `; `'s in the string containing the most of them, if this isn't too great a number, but a macro isn't too difficult either. It doesn't want to replace with `\K`, but you can use grouping; posted an answer. – Bernhard Barker Sep 18 '13 at 22:53

1 Answer

With a bit of help from HamZa, I came up with this:

Replace:

(S#.*?;) (?=.*?</ns)

with:

\1

Then hit Replace All repeatedly until no more replacements are made. Each match starts at the S#, so only one "; " per string is fixed in each pass, and a string containing n of them needs n passes. Alternatively, record a simple macro that runs the find-and-replace and play it back as many times as required.
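To see why several passes are needed, here is a minimal sketch of the same loop in Python. It is only an illustration: Python's `re` module is not the regex engine Notepad++ uses, and the helper name `replace_until_stable` and the shortened sample string are made up for this example.

```python
import re

# Same pattern as above: capture from S# up to a ';', then require a space
# that is still followed by '</ns' later on the same line.
PATTERN = re.compile(r"(S#.*?;) (?=.*?</ns)")

def replace_until_stable(text):
    """Hypothetical helper: repeat the replacement until a pass changes nothing."""
    passes = 0
    while True:
        # subn returns the new text and the number of replacements made
        text, count = PATTERN.subn(r"\1", text)
        if count == 0:
            return text, passes
        passes += 1

sample = "S#14_Budget.Y#2014; 2015; 2016.P{[Third Generation]}</ns"
result, passes = replace_until_stable(sample)
print(result)   # S#14_Budget.Y#2014;2015;2016.P{[Third Generation]}</ns
print(passes)   # 2
```

The sample has two "; " pairs, so two passes change something and a third finds nothing left. The space inside "[Third Generation]" is untouched because it does not follow a semicolon.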

If each of these strings is on its own line, also anchor the pattern with start-of-line (^) and end-of-line ($) markers:

^(S#.*?;) (?=.*?</ns$)
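A small Python sketch of what the anchors change, under the assumption that `^` and `$` match at line boundaries (in Python this requires `re.MULTILINE`). The two-line sample text is invented for the illustration.

```python
import re

# Anchored variant: the line must start with S# and end with </ns.
# re.MULTILINE makes ^ and $ match at every line boundary in Python.
LINE_PATTERN = re.compile(r"^(S#.*?;) (?=.*?</ns$)", re.MULTILINE)

text = (
    "S#A; B</ns\n"                  # whole line is the string: matches
    "note: S#A; B</ns trailing\n"   # extra text around it: no match
)

once, count = LINE_PATTERN.subn(r"\1", text)
print(once)
print(count)  # 1
```

Only the first line is changed. The second line is skipped because it neither starts with S# nor ends with </ns.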

Explained: (source)

NODE                     EXPLANATION
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    S#                       'S#'
--------------------------------------------------------------------------------
    .*?                      any character except \n (0 or more times
                             (matching the least amount possible))
--------------------------------------------------------------------------------
    ;                        ';'
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
                           ' '
--------------------------------------------------------------------------------
  (?=                      look ahead to see if there is:
--------------------------------------------------------------------------------
    .*?                      any character except \n (0 or more times
                             (matching the least amount possible))
--------------------------------------------------------------------------------
    </ns                     '</ns'
--------------------------------------------------------------------------------
  )                        end of look-ahead
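
The breakdown above can be checked on a single match. This is a sketch using Python's `re` module as a stand-in (an assumption: it is not the engine Notepad++ uses), with a shortened, made-up sample string.

```python
import re

sample = "x S#14_Budget.Y#2014; 2015</ns"
m = re.search(r"(S#.*?;) (?=.*?</ns)", sample)

# Group 1 is everything from S# through the first ';' that is followed by a space.
print(repr(m.group(1)))  # 'S#14_Budget.Y#2014;'

# The full match is group 1 plus the space; the look-ahead consumes nothing.
print(repr(m.group(0)))  # 'S#14_Budget.Y#2014; '

# The rest of the string, including '</ns', lies after the match.
print(repr(sample[m.end():]))  # '2015</ns'
```

Replacing the full match with \1 therefore drops exactly one character, the space.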

If Notepad++ supported variable-length look-behinds (it doesn't, at least as of 6.4.5), you could replace (?<=S#14.*?); (?=.*?</ns) with ; and need only a single Replace All.
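
Outside Notepad++, a single pass is possible without look-behinds by using a different technique: match each whole S#...</ns string and fix the "; " pairs inside it with a replacement callback. The sketch below does this in Python; the function name and sample text are hypothetical, and this is not the method used in the answer above.

```python
import re

# Match each whole S#...</ns string, lazily and within one line.
WHOLE = re.compile(r"S#.*?</ns")

def strip_spaces_after_semicolons(text):
    """Hypothetical helper: drop the space after ';' only inside S#...</ns."""
    return WHOLE.sub(lambda m: m.group(0).replace("; ", ";"), text)

sample = "keep; this S#a; b; c</ns and; this S#x; y</ns"
fixed = strip_spaces_after_semicolons(sample)
print(fixed)  # keep; this S#a;b;c</ns and; this S#x;y</ns
```

Text outside the S#...</ns strings is left alone, and every "; " inside each string is handled in one call.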

Bernhard Barker