I'm trying to write a PCRE regex pattern that matches only numbers of length 8, 9, or 12 while leaving the last 4 digits unmasked. Below is the pattern I have so far.

/(?<=\D|^)(?=\d{8,12}\D|$)(\d{3})[\s,-]?\[\s,-]\K\d{4}|(?<=\D|^)(?=\d{12}|\d{9}\D|$)\d{5}/gmi

Regex101 link with my test cases: https://regex101.com/r/DukWNG/1

Right now it matches only the first 5 digits, but for numbers of length 8, 9, or 12 it should match every digit except the last 4.

Test cases:

  1. if the length of the number is 8, then it should match first 4 leaving the last 4 unmasked
  2. if the length of the number is 9, then it should match first 5 leaving the last 4 unmasked
  3. if the length of the number is 12, then it should match first 8 digits leaving the last 4 unmasked.

It should also match when there is a hyphen, comma, dash, or space between the digits. I'm stuck and not sure how to make this work. Any help would be really great.
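The three test cases reduce to one rule: strip the separators, count the digits, and mask everything except the last 4 when the count is 8, 9, or 12. A minimal sketch of that rule in Python follows. The function name `digits_to_mask` is hypothetical, and the separator class `[\s,-]` is taken from the pattern in the question.

```python
import re

# Separators copied from the question's pattern: whitespace, comma, hyphen.
SEPARATORS = re.compile(r"[\s,-]")

def digits_to_mask(number: str) -> int:
    """Return how many leading digits should be masked.

    Returns 0 when the input is not a number of 8, 9 or 12 digits.
    """
    digits = SEPARATORS.sub("", number)
    if not re.fullmatch(r"[0-9]+", digits):
        return 0
    return len(digits) - 4 if len(digits) in (8, 9, 12) else 0
```

Any regex proposed for this problem can be checked against this rule: the number of digits it matches should equal `digits_to_mask(...)` for the same input.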

Casimir et Hippolyte
Kamesh
  • Looks like `(?=\d{8,12}\D|$)` must be `(?=\d{8,12}(?:\D|$))`, and generally, you need to group alternations. `(?<!\d)` is shorter than `(?<=\D|^)`, use that. – Wiktor Stribiżew Mar 17 '23 at 12:16
  • @WiktorStribiżew, `/(?<!\d)(?=\d{8,12}(?:\D|$))(\d{3})[\s,-]?\[\s,-]\K\d{4}|(?<=\D|^)(?=\d{12}|\d{9}\D|$)\d{5}/gmi` These changes are still not giving me results. Can you show me how to group alternations? – Kamesh Mar 17 '23 at 12:34
  • @Kamesh I am toying with this like: https://3v4l.org/KSuS6 and https://3v4l.org/U3CI2 . Perhaps you should clarify the scope/variability of your data by posting a comprehensive and challenging battery of sample strings and your exact desired output. The better your [mcve], the better we can help you. – mickmackusa Mar 17 '23 at 23:00
  • On the topic of redaction: [How to replace the mobile number with stars except last 4 digits in php](https://stackoverflow.com/q/56986004/2943403) and [Obscure email addresses in a sentence](https://stackoverflow.com/q/59306540/2943403) and [Hint or partially hide email address with stars (*) in PHP](https://stackoverflow.com/q/43762251/2943403) and [How to redact sensitive substring following a specific substring?](https://stackoverflow.com/q/28150311/2943403) – mickmackusa Mar 17 '23 at 23:07
  • @mickmackusa, this won't be used in PHP code; it's for a Genesys Designer tool, where I will enter this regex and it will automatically mask the numbers except the last 4. – Kamesh Mar 18 '23 at 07:34
  • This question has a php tag. – mickmackusa Mar 18 '23 at 09:34
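
To illustrate the grouping point raised in the first comment: an alternation inside a lookahead splits the whole lookahead body unless it is wrapped in its own group. The sketch below uses Python's `re` module, which is an assumption made purely for demonstration, and a simplified `\d{4}` tail rather than the full pattern from the question.

```python
import re

# Ungrouped: reads as "(8-12 digits then a non-digit) OR (end of string)".
ungrouped = re.compile(r"(?<!\d)(?=\d{8,12}\D|$)\d{4}")

# Grouped: reads as "8-12 digits, then (a non-digit OR end of string)".
grouped = re.compile(r"(?<!\d)(?=\d{8,12}(?:\D|$))\d{4}")

# A number at the very end of the input has no trailing non-digit,
# so only the grouped form accepts it.
at_end = "12345678"
followed_by_space = "12345678 "
```

With the ungrouped form, the `$` branch can only succeed at the end of the input, where no digits are left for `\d{4}` to consume, so a number that ends the string is never matched.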

2 Answers

This pattern seems to do the job:

~
(?<d> [0-9] [-_ –.]* ){4} # subpattern definition

(?= 
    \g<d>{4} (?: \g<d> (?: \g<d>{3} )? )? # allowed digit sequences
    (?: \g<d>+ (*SKIP) (*F) )? # skip the substring if digits remain
)

\g<d>* (?= \g<d>{4} ) # backtrack until there're 4 digits at the end
~ux

Feel free to define exactly what the possible separators should look like.

An interesting thing about this pattern is that it doesn't need to be anchored at the start (with ^ or a lookbehind), since two constraints are already in place:

  • the four digits at the start plus the four digits required in the lookahead, which together discard sequences with fewer than 8 digits
  • and the \g<d>+ (*SKIP) (*F) control-verb combination, which jumps past the remaining digits when the sequence doesn't have 8, 9 or 12 digits.
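
For engines without subroutine calls or backtracking control verbs (Python's built-in `re` is one), the same idea can be approximated with a callback. This is a rough sketch, not a translation of the pattern above: it matches each whole run of digits and separators, then decides in code whether to mask it. The names `DIGIT_RUN` and `mask_runs` are hypothetical; the separator class is copied from the pattern above.

```python
import re

# One run = digits joined by optional separators. Because re.sub consumes a
# whole run at a time, a run with the wrong digit count is skipped in one
# piece, which is the role (*SKIP)(*F) plays in the PCRE pattern.
DIGIT_RUN = re.compile(r"[0-9](?:[-_ –.]*[0-9])*")
ALLOWED_LENGTHS = (8, 9, 12)
ASCII_DIGITS = "0123456789"

def mask_runs(text: str, mask_char: str = "*") -> str:
    """Mask all but the last 4 digits of every 8-, 9- or 12-digit run."""
    def replace(match):
        run = match.group()
        digit_count = sum(ch in ASCII_DIGITS for ch in run)
        if digit_count not in ALLOWED_LENGTHS:
            return run  # leave runs of any other length untouched
        remaining = digit_count - 4
        out = []
        for ch in run:
            if ch in ASCII_DIGITS and remaining > 0:
                out.append(mask_char)
                remaining -= 1
            else:
                out.append(ch)  # separators and the last 4 digits stay
        return "".join(out)
    return DIGIT_RUN.sub(replace, text)
```

For example, `mask_runs("1234-5678-9012")` gives `****-****-9012`, while a 13-digit run is returned unchanged.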
Casimir et Hippolyte

((?:\d[-_ \.]*){4,5}|(?:\d[-_ \.]*){8})(?:\d[-_ \.]*){4}$

https://regex101.com/r/9WmkNz/1

Oliver Hao
  • The only problem is that when a number has more than 12 digits, the entire number should not be matched. With your solution it matches the last 12 digits and leaves the last 4 unmatched. How can we rewrite it? – Kamesh Mar 17 '23 at 13:28
  • @Kamesh `^((?:\d[-–_ \.]*){4,5}|(?:\d[-–_ \.]*){8})(?:\d[-–_ \.]*){4}$` – Oliver Hao Mar 19 '23 at 06:46
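
The effect of the added `^` anchor can be checked with a short script. This is a sketch in Python's `re`, chosen only for demonstration; the helper `mask` is hypothetical, and the multiline flag is an assumption so that `^` and `$` apply per line. Both patterns are the ones from this answer and its follow-up comment.

```python
import re

UNIT = r"(?:\d[-–_ \.]*)"  # one digit plus any trailing separators
BODY = rf"({UNIT}{{4,5}}|{UNIT}{{8}}){UNIT}{{4}}$"

UNANCHORED = re.compile(BODY, re.MULTILINE)
ANCHORED = re.compile("^" + BODY, re.MULTILINE)

def mask(text: str, pattern: re.Pattern) -> str:
    """Replace the digits captured in group 1 with '*', keep the rest."""
    def replace(match):
        head = match.group(1)              # the part to mask
        tail = match.group(0)[len(head):]  # the last 4 digits
        return re.sub(r"\d", "*", head) + tail
    return pattern.sub(replace, text)
```

With a 13-digit input, the unanchored pattern starts matching at the second digit and masks 8 of the last 12 digits, whereas the anchored pattern leaves the input untouched.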