0

I doubt it's possible, but I haven't found anything that specifically says it isn't. Is there some way to construct a parallel alternation in a search-and-replace regex? For example, if I wanted to replace street types with their abbreviations, could I do something like this:

s/(STREET|AVENUE|BOULEVARD)/(ST|AVE|BLVD)/ 

without having the entire rhs substituted in? Or do I really have to do separate replaces for each street type?

Eddie Rowe
  • 1
    Language? You can do this in Perl and Python by calling a function. – dawg Sep 27 '16 at 19:46
  • 1
    What language are you using? Many languages allow you to use a function when replacing, and then it can provide different replacements depending on the matched string. E.g. PHP `preg_replace_callback()`. – Barmar Sep 27 '16 at 19:47
  • If you're doing this in a text editor, it's probably not possible. – Barmar Sep 27 '16 at 19:48
  • 1
    [It is possible in Notepad++.](http://stackoverflow.com/questions/37160927/how-to-use-conditionals-when-replacing-in-notepad-via-regex/37161309#37161309) – Wiktor Stribiżew Sep 27 '16 at 19:50
  • 1
    This could be done in Dreamweaver too: replace `(?:(ST)REET|(AVE)NUE|(B)OU(L)E(V)AR(D))` with `$1$2$3$4$5$6`. Knowing which tool you are using would help this question a lot. – chris85 Sep 27 '16 at 19:57
  • Doing this in SAS, so the fun stuff Perl allows is out. – Eddie Rowe Sep 28 '16 at 01:52

5 Answers

3

This isn't that pretty, but it'll get the job done:

Replace

(?:(ST)REET|(AVE)NUE|(B)OU(L)E(V)AR(D))

with

\1\2\3\4\5\6

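The pattern matches each word and captures only the letters that make up its abbreviation. Only the groups inside the alternative that actually matched contain text; in engines that expand unmatched groups to an empty string, replacing with all six capture groups therefore inserts just the abbreviation.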
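
As a rough illustration of the same trick outside a text editor, here is a minimal Python sketch. It assumes Python 3.5 or later, where unmatched groups in a replacement template expand to an empty string; the function name `abbreviate` and the sample address are made up for the example.

```python
import re

# Each alternative captures only the letters that survive in the abbreviation.
PATTERN = re.compile(r'(?:(ST)REET|(AVE)NUE|(B)OU(L)E(V)AR(D))')

def abbreviate(text):
    # Groups belonging to alternatives that did not match expand to ''
    # (Python 3.5+), so concatenating all six leaves just the captured letters.
    return PATTERN.sub(r'\1\2\3\4\5\6', text)

print(abbreviate('12 MAIN STREET, 5 PARK AVENUE, 9 SUNSET BOULEVARD'))
# 12 MAIN ST, 5 PARK AVE, 9 SUNSET BLVD
```

Note that this only works when the abbreviation is a subsequence of the original word, and the pattern has no word boundaries, so it would also rewrite `STREETS` to `STS`.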

See it here at regex101.

SamWhan
  • So, can we up the ante on parallel replacements that aren't strictly abbreviations? - so that /(FIRST|SECOND|THIRD)/ could be replaced by 1ST|2ND|3RD – Eddie Rowe Sep 28 '16 at 15:13
  • 2
    Not without programming logic, to my knowledge (or, as mentioned, Notepad++ and the like). – SamWhan Sep 28 '16 at 15:30
3

Just for fun, and for these three words only, in PCRE, Perl, the Python regex module, or Notepad++:

(?:\G(?!^)|\b(?=(?:STREET|AVENUE|BOULEVARD)\b))[A-Z]*?\K(?:TREE|E(?:NU)?|OU|AR)\B

replace with the empty string.


or this one:

\G[A-Z]*?(?>\W*\b(?>\w+\W+)*?(?=(?:STREET|AVENUE|BOULEVARD)\b))?[A-Z]*?\K(?:TREE\B|E(?:NU)?\B|OU\B|AR\B)


Casimir et Hippolyte
2

In Python, you would pass a callback that looks the matched word up in a dictionary, like so:

>>> import re
>>> abbr = {'STREET': 'ST', 'AVENUE': 'AVE', 'BOULEVARD': 'BLVD'}
>>> re.sub(r'(STREET|AVENUE|BOULEVARD)', lambda m: abbr[m.group(1)], 'Fourth STREET')
'Fourth ST'
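
The dictionary approach is not limited to abbreviations that are subsequences of the original word. The sketch below is one possible generalization, not part of the original answer: it builds the alternation from the dictionary keys and adds word boundaries. The names `ABBREVIATIONS`, `WORDS`, and `abbreviate_words`, as well as the ordinal entries, are assumptions added for illustration.

```python
import re

# Hypothetical lookup table; the ordinal entries are illustrative additions.
ABBREVIATIONS = {
    'STREET': 'ST', 'AVENUE': 'AVE', 'BOULEVARD': 'BLVD',
    'FIRST': '1ST', 'SECOND': '2ND', 'THIRD': '3RD',
}

# Build the alternation from the keys so pattern and table cannot drift apart.
# re.escape guards against keys containing regex metacharacters, and \b keeps
# the pattern from matching inside longer words such as STREETS.
WORDS = re.compile(
    r'\b(?:' + '|'.join(re.escape(key) for key in ABBREVIATIONS) + r')\b'
)

def abbreviate_words(text):
    # The callback receives the match object and returns the mapped value.
    return WORDS.sub(lambda m: ABBREVIATIONS[m.group(0)], text)

print(abbreviate_words('SECOND AVENUE'))
# 2ND AVE
```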

In Perl, you can do:

use strict;
use warnings;

my %abs = (
    'STREET'    => 'ST',
    'AVENUE'    => 'AVE',
    'BOULEVARD' => 'BLVD',
);
$_ = 'Fourth STREET';
# One capture group around the whole alternation, so $1 always holds the matched word.
s/(STREET|AVENUE|BOULEVARD)/$abs{$1}/ && print;

dawg
1

It depends on the language or tool you are using. For example, in Notepad++ you can replace

(STREET)|(AVENUE)|(BOULEVARD)

with:

(?1ST)(?2AVE)(?3BLVD)
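
For comparison, a similar "which group matched?" decision can be emulated in Python. This is only a sketch of the idea, not Notepad++ syntax: it relies on `Match.lastindex` from the standard `re` module, and the names `REPLACEMENTS` and `replace_by_group` are made up for the example.

```python
import re

PATTERN = re.compile(r'(STREET)|(AVENUE)|(BOULEVARD)')

# Position i holds the replacement for capture group i + 1.
REPLACEMENTS = ['ST', 'AVE', 'BLVD']

def replace_by_group(text):
    # Exactly one of the three groups participates in any match, and
    # m.lastindex is the number of that group (1, 2, or 3).
    return PATTERN.sub(lambda m: REPLACEMENTS[m.lastindex - 1], text)

print(replace_by_group('PARK AVENUE and SUNSET BOULEVARD'))
# PARK AVE and SUNSET BLVD
```
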
logi-kal
-1

Well, the first two substrings aren't too difficult:

import re

s = 'street'; a = 'avenue'; b = 'boulevard'

re.sub(r'(str)eet|(ave)nue|(boulevard)', r'\1 \2 \3', s)
re.sub(r'(str)eet|(ave)nue|(boulevard)', r'\1 \2 \3', a)
re.sub(r'(str)eet|(ave)nue|(boulevard)', r'\1 \2 \3', b)

The last three calls return the captured text plus the literal spaces from the replacement template, because the groups that weren't matched expand to empty strings. Getting 'blvd' from 'boulevard' needs further processing, since its letters are not contiguous in the original word. That's reasonable, though: extracting a set of substrings from 'boulevard' is a separate problem from capturing and replacing one of a set of alternatives.

Perhaps, since this way already requires the extra step of removing whitespace, one could do something like this:

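# with boulevard: the first call yields '  blvd'
new_str = re.sub(r'(str)eet|(ave)nue|(b)oulevard', r'\1 \2 \3lvd', b)
# strip whitespace and any 'lvd' that starts at a word boundary -> 'blvd'
re.sub(r'\s+|\blvd', '', new_str)

# with avenue: the first call yields ' ave lvd'
new_str = re.sub(r'(str)eet|(ave)nue|(b)oulevard', r'\1 \2 \3lvd', a)
# the stray 'lvd' follows a space, so it is removed as well -> 'ave'
re.sub(r'\s+|\blvd', '', new_str)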

The code looks kinda funny though.

Chad Davis
  • Hmm... How does [this example at regex101 strike you](https://regex101.com/r/38q300/2)? – SamWhan Sep 28 '16 at 14:10
  • @ClasG, as I said, funny (not good). That's why I added the line of code which removes any whitespaces or sequences 'lvd' with a word boundary immediately on the left. – Chad Davis Sep 28 '16 at 15:29
  • @ClasG, ah, I see. My test cases (the variables s, a, and b) didn't cover full sentences, and I see that that's pretty unrealistic. – Chad Davis Sep 28 '16 at 15:38