2

I would like to use a regular expression to mask all but the first three alphanumeric characters of each word in a string using a mask character (such as "x"), so "1 Buckingham Palace Road, London" would become "1 Bucxxxxxxx Palxxx Roax, Lonxxx".

Keeping the first three characters is easily done using

s/\b(\w{0,3})(\w*)\b/$1/g

but I cannot seem to figure out how to insert length($2) times the masking character instead of $2.

Thanks!

Thilo-Alexander Ginkel

3 Answers

4

C#:

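// Match \w rather than . so that runs of punctuation are never masked; Replace returns the new string.
string masked = new Regex(@"(?<!\b.{0,2})\w").Replace("1 Buckingham Palace Road, London", "x");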

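Since you say it's language-agnostic, I trust this can be ported to your language of choice, provided its regex engine supports variable-length lookbehind (.NET does; many others accept only fixed-width lookbehind).

Or you could just get the length of $2 and fill in the x's the old-fashioned way.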
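The "old-fashioned way" could look roughly like the Python sketch below, which uses a replacement callback instead of lookbehind. The function name `mask_words` and its `keep` and `mask` parameters are made up for illustration and are not part of the original answer.

```python
import re

def mask_words(text: str, keep: int = 3, mask: str = "x") -> str:
    """Keep the first `keep` word characters of each word, mask the rest."""
    # Group 1: up to `keep` leading word characters; group 2: the remainder.
    pattern = re.compile(rf"\b(\w{{0,{keep}}})(\w*)")
    # The callback does the "length($2) times the mask character" step.
    return pattern.sub(lambda m: m.group(1) + mask * len(m.group(2)), text)

print(mask_words("1 Buckingham Palace Road, London"))
# prints: 1 Bucxxxxxxx Palxxx Roax, Lonxxx
```

Because only `\w` characters are ever captured, punctuation and whitespace pass through unchanged.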

Matthew Flaschen
1

Positive lookbehind: any word character with three word characters before it is replaced by an x:

s/(?<=\w{3})\w/x/g;

Example Perl script:

my $string = "1 Buckingham Palace Road, London";
$string =~ s/(?<=\w{3})\w/x/g;
print qq($string\n);
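
For comparison, the same fixed-width lookbehind appears to carry over directly to Python's `re` module, which only accepts lookbehinds of a fixed length (`\w{3}` qualifies). This is a sketch of a port, not part of the original answer; the variable names are illustrative.

```python
import re

text = "1 Buckingham Palace Road, London"

# (?<=\w{3}) asserts that three word characters precede the current one,
# so only the fourth and later characters of each word are replaced.
masked = re.sub(r"(?<=\w{3})\w", "x", text)

print(masked)
# prints: 1 Bucxxxxxxx Palxxx Roax, Lonxxx
```

The lookbehind inspects the original string rather than the partially replaced one, so every character after the third in a word still sees three word characters behind it.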
Danalog
0
use warnings;
use strict;

my $string = "1 Buckingham Palace Road, London";

$string =~ s(
  \b(\w{0,3})(\w*)\b
){
  $1 . ( 'x' x length $2 )
}gex;

print $string, "\n";
Brad Gilbert