How to select a string depending on a prefix and a suffix, but not them

Question

I have a collection of strings like this (each gap is a tab character):

29  301 3   31  0       TREZILIDE       Trézilidé
2A  001 1   73  1   (LE)    AFA (Le)    Afa

What I want is to transform it into this:

29301 Trézilidé
2A001 (Le) Afa

Remove the first tabulation
Remove the tabulations, the numbers and the first, uppercase occurrence of the name, replacing the whole run with a single space
Replace the last tabulation with a space

My main problems are:

How to select the first tabulation without selecting the "prefix" and the "suffix"? (like ^(..)\t[0-9] but without selecting ^(..) nor [0-9])
How to select everything from just after the 3 digits up to and including the tabulation that follows the uppercase word?

I am doing this in a text file, using the search and replace dialog of Notepad++.

Thanks in advance for your help!

BoltClock · Accepted Answer · 2012-04-27T09:18:57.433


How to select the first tabulation without selecting the "prefix" and the "suffix"?

Optimally this is done using lookahead and lookbehind assertions, but Notepad++ doesn't support those before version 6.0. The next best solution is to just capture them, then backreference them in the replacement string.
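For readers whose engine does support assertions, here is a minimal sketch of the idea in Python's `re` module rather than in Notepad++. The sample rows and their exact tab layout are an assumption based on the question, and the variable names are made up for illustration. It only covers the first sub-question (removing the first tabulation), not the full transformation.

```python
import re

# Sample rows from the question; the exact tab layout is assumed.
rows = (
    "29\t301\t3\t31\t0\t\tTREZILIDE\t\tTrézilidé\n"
    "2A\t001\t1\t73\t1\t(LE)\tAFA\t(Le)\tAfa"
)

# (?<=^..)  lookbehind: exactly two characters since the start of the line
# (?=[0-9]) lookahead: a digit must follow
# Only the tab itself is part of the match, so no backreference is needed.
first_tab = re.compile(r"(?<=^..)\t(?=[0-9])", re.MULTILINE)

matches = first_tab.findall(rows)
without_first_tab = first_tab.sub("", rows)
print(without_first_tab)
```

Python requires the lookbehind to have a fixed width, which `^..` satisfies; a variable-width lookbehind would be rejected there.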

Here's how I did it (in answer to your full question):

Check Match case to do a case-sensitive find
Find by regex:
```
^(..)\t(\d\d\d)[\tA-Z0-9()]+\t(.+)$
```
Replace with:
```
\1\2 \3
```
I end up with this, where <tab> represents an actual tabulation:
```
29301 Trézilidé
2A001 (Le)<tab>Afa
```
To get rid of that I do an extended find:
```
\t
```
And replace it with the space character, to obtain the final result:
```
29301 Trézilidé
2A001 (Le) Afa
```
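The same two passes can be tried outside Notepad++. The following is a rough sketch using Python's `re` module, which is case-sensitive by default (the equivalent of checking Match case). The tab layout of the sample rows and the helper name `clean` are assumptions made for illustration.

```python
import re

# Sample rows from the question; the exact tab layout is assumed.
rows = [
    "29\t301\t3\t31\t0\t\tTREZILIDE\t\tTrézilidé",
    "2A\t001\t1\t73\t1\t(LE)\tAFA\t(Le)\tAfa",
]

# Same pattern as above: capture the prefix and suffix, then put them
# back with backreferences instead of using lookaround assertions.
pattern = re.compile(r"^(..)\t(\d\d\d)[\tA-Z0-9()]+\t(.+)$")


def clean(row: str) -> str:
    """Apply the regex pass, then turn any leftover tab into a space."""
    joined = pattern.sub(r"\1\2 \3", row)
    return joined.replace("\t", " ")


cleaned = [clean(row) for row in rows]
for line in cleaned:
    print(line)
```

The character class stops at the first lowercase letter, which is why the mixed-case name survives while the uppercase copy is consumed.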

edited Apr 27 '12 at 09:18

answered Jan 21 '11 at 12:55

BoltClock


+1 I was running an older version of notepad++ and was wondering why I couldn't use assertions. Thankfully this post was one of the first that came up! – Malachi Apr 27 '12 at 09:09
@Malachi: I haven't been able to come up with a solution to this particular question using assertions. But it's always good to know it finally supports them now :) – BoltClock Apr 27 '12 at 09:18
@BoltClock'saUnicorn: It looks like lookbehinds are not supported :( Only forward assertions. – Malachi Apr 27 '12 at 10:10
@Malachi: Interesting. Maybe more complex ones aren't supported. I can match `abc` in `; abc` using `(?<=;\s).*` but not when I use `(?<=;\s*).*` – BoltClock Apr 27 '12 at 10:12
@BoltClock'saUnicorn: I think it could be. I'm no regular expression expert but `(>"(?!\s+)).+((?<!\s+)"<)` This works in eclipse but not in Notepad++ – Malachi Apr 27 '12 at 10:51

score 1 · Answer 2 · answered Jan 21 '11 at 12:59


Try

^(..)\t

Replace with

\1

Then

\(*[A-Z][A-Z]+\)*

Replace with an empty string (with Match case checked, so that mixed-case names are left alone); this removes (LE) and AFA too.

''

Then

^(.....)[\t0-9]*\t(.+)$

Replacement:

\1 \2

And finally:

\t

Replace every occurrence with a space.
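As a rough check, the step-by-step approach can be sketched in Python's `re` module. The sample rows, their tab layout and the function name `clean_stepwise` are assumptions for illustration. Step three is written here as `^(.....)[\t0-9]*\t(.+)$`, a form chosen so that accented letters and a parenthesised article such as (Le) are kept.

```python
import re

# Sample rows from the question; the exact tab layout is assumed.
rows = [
    "29\t301\t3\t31\t0\t\tTREZILIDE\t\tTrézilidé",
    "2A\t001\t1\t73\t1\t(LE)\tAFA\t(Le)\tAfa",
]


def clean_stepwise(row: str) -> str:
    # 1. Drop the tab after the two-character prefix.
    row = re.sub(r"^(..)\t", r"\1", row)
    # 2. Remove all-uppercase words, with optional parentheses.
    #    Python matching is case-sensitive unless re.IGNORECASE is set.
    row = re.sub(r"\(*[A-Z][A-Z]+\)*", "", row)
    # 3. Collapse the run of tabs and digits after the 5-character code.
    row = re.sub(r"^(.....)[\t0-9]*\t(.+)$", r"\1 \2", row)
    # 4. Replace every remaining tab with a space.
    return row.replace("\t", " ")


stepwise = [clean_stepwise(row) for row in rows]
for line in stepwise:
    print(line)
```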

HTH

answered Jan 21 '11 at 12:59

Zsolt Botykai


Thank you, you have helped me ^^ – Pascal Qyy Jan 21 '11 at 19:43
