
I have a simple regex to remove undesired chars (like numbers, for instance) from a string:

preg_replace('/[^a-z_]/', '', $str);

But now it must keep a prefix that contains some of the chars that were being removed. The prefix consists of a letter followed by four digits (prefix example: b1234). String example:

b7001_cp_parc_venc_fluxo

So I tried to add a non-capturing group for that prefix, but I can't make it work. I tried things like:

(?:b[0-9]{4})[^a-z_]

But with that, digits elsewhere in the string are no longer removed, for example.

Wiktor Stribiżew
DontVoteMeDown

2 Answers


One option is to use the (*SKIP)(*F) backtracking control verbs.

b[0-9]{4}(*SKIP)(*F)|[^a-z_]
  • b[0-9]{4}(*SKIP)(*F) Match the prefix you want to keep, then fail that match and resume scanning after it
  • | Or
  • [^a-z_] Match any char other than a-z or _

You can also repeat the character class one or more times, [^a-z_]+, to get a single match for each run of consecutive characters.


Example

$pattern = "/b[0-9]{4}(*SKIP)(*F)|[^a-z_]/";
$str = "b7001_cp_parc_venc_fluxo_1234";
echo preg_replace($pattern, "", $str);

Output

b7001_cp_parc_venc_fluxo_
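
To illustrate the note about [^a-z_]+, here is a small sketch that uses the optional count argument of preg_replace to compare both variants. The variable names are only for illustration; the result string should be the same either way, only the number of replacements differs.

```php
<?php
$str = "b7001_cp_parc_venc_fluxo_1234";

// The fifth argument of preg_replace() receives the number of replacements made.
$perChar = preg_replace('/b[0-9]{4}(*SKIP)(*F)|[^a-z_]/',  '', $str, -1, $countPerChar);
$perRun  = preg_replace('/b[0-9]{4}(*SKIP)(*F)|[^a-z_]+/', '', $str, -1, $countPerRun);

var_dump($perChar === $perRun);                 // bool(true): same resulting string
echo $countPerChar, " vs ", $countPerRun, "\n"; // 4 vs 1
```

With the quantifier, the trailing 1234 is removed as one match instead of four.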
The fourth bird

Since you are removing the found matches, you can also use

preg_replace('~^b[0-9]{4}\K|[^a-z_]+~', '', $string)


Details

  • ^b[0-9]{4}\K - matches b and four digits at the start of the string; the \K operator then discards the text matched so far, so the match is empty and nothing gets replaced
  • | - or
  • [^a-z_]+ - matches one or more chars other than lowercase ASCII letters or underscore and these matches are removed.

PHP demo:

$string = 'b7001_cp---_parc1323546_venc.,?><_     fluxo';
echo preg_replace('/^b[0-9]{4}\K|[^a-z_]/', '', $string);
// => b7001_cp_parc_venc_fluxo
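
One difference worth noting between this pattern and an unanchored (*SKIP)(*F) pattern: because of the ^, only a prefix at the very start of the string is protected. A minimal sketch, assuming a hypothetical input where the prefix-like text appears in the middle (the input and variable names are made up for illustration):

```php
<?php
$unanchored = '/b[0-9]{4}(*SKIP)(*F)|[^a-z_]+/'; // protects b + 4 digits anywhere
$anchored   = '/^b[0-9]{4}\K|[^a-z_]+/';         // protects it only at the start

$input = 'cp_b1234_9'; // hypothetical input, prefix-like text not at the start

$keptAnywhere  = preg_replace($unanchored, '', $input);
$keptStartOnly = preg_replace($anchored, '', $input);

echo $keptAnywhere, "\n";  // cp_b1234_
echo $keptStartOnly, "\n"; // cp_b_
```

Which behaviour is right depends on whether b1234-style text can legitimately occur later in the string.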
Wiktor Stribiżew