\b not working as expected with sed command

Question

I am trying to solve a task I was given using only sed. The task is:

Given lines of credit card numbers, mask the first 12 digits of each credit card number with asterisks (i.e., *) and print the masked card number on a new line. Each credit card number consists of four space-separated groups of four digits. For example, the credit card number 1234 5678 9101 1234 would be masked and printed as **** **** **** 1234.

I have successfully used the following command. It is working as expected and printing the desired output.

sed 's/\([0-9]\{4\}\s\)\{3\}\([0-9]\{4\}\)/**** **** **** \2/'

However, I tried another solution using \b, and it does not work. I don't understand why. I expected \b to match at the beginning of each group and at the space between the groups. I know the problem can be solved with \s, but I want to understand what is wrong with the \b-only solution:

sed 's/\(\b[0-9]\{4\}\b\)\{3\}\([0-9]\{4\}\)/**** **** **** \2/'

NOTE: I already have a working solution; I just want to understand why my solution using \b does not work.

What version of sed are you using? Different versions recognize different regular expressions. — glenn jackman, Nov 21 '19 at 17:54
You might consider: `sed -E 's/[[:digit:]]{4}([^[:digit:]])/****\1/g'` — glenn jackman, Nov 21 '19 at 17:55
`sed (GNU sed) 4.7`. I just want to understand why \b version doesn't work. — abhiarora, Nov 21 '19 at 18:06
For some reason my answer was also downvoted. I have upvoted the question to neutralize an unnecessary downvote. — anubhava, Nov 21 '19 at 18:24
After reading the documentation, I think my question isn't a duplicate: the problem with my expression was that I didn't understand how repetition works in regular expressions or how a word boundary behaves. So the issue is how an expression behaves when repetition and a word boundary are used together. Thanks, everyone! — abhiarora, Nov 21 '19 at 18:47

anubhava · Accepted Answer · 2019-11-21T18:33:20.630

\b does work in GNU sed, but your second regex is incorrect.

You should be using:

sed 's/\b\([0-9]\{4\}\s\)\{3\}\([0-9]\{4\}\)/**** **** **** \2/' file

or with -E

sed -E 's/\b([0-9]{4}\s){3}([0-9]{4})/**** **** **** \2/' file

Note that the second \b should be replaced with \s (whitespace), since your input text has spaces between the numbers. \b is a zero-width assertion: it matches a position and cannot consume the space itself.
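As a quick check of the point above, here is a minimal sketch that runs the original attempt and the corrected command on the sample number from the question. It assumes GNU sed (\b and \s are GNU extensions); the variable names `card`, `broken`, and `fixed` are only illustrative.

```shell
# Assumes GNU sed: \b and \s are GNU extensions, not POSIX.
card='1234 5678 9101 1234'

# Original attempt: \b is zero-width, so nothing consumes the spaces
# between the groups and the pattern never matches; the line is unchanged.
broken=$(printf '%s\n' "$card" |
  sed 's/\(\b[0-9]\{4\}\b\)\{3\}\([0-9]\{4\}\)/**** **** **** \2/')

# Corrected: \s consumes the space after each of the first three groups.
fixed=$(printf '%s\n' "$card" |
  sed 's/\b\([0-9]\{4\}\s\)\{3\}\([0-9]\{4\}\)/**** **** **** \2/')

printf 'broken: %s\nfixed:  %s\n' "$broken" "$fixed"
```

The first command should print the input unchanged, while the second should print **** **** **** 1234.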

Here is a good article on word boundaries: https://www.regular-expressions.info/wordboundaries.html

edited Nov 21 '19 at 18:33

answered Nov 21 '19 at 18:10

anubhava


What's wrong with it? I know I can do it with `\s` as I have already solved the problem but not sure why `\b` doesn't work inside regex sub-expression? – abhiarora Nov 21 '19 at 18:12
`\b` does work, as you can see in my regex as well. But `\b` is a word boundary and it cannot match a space. Your input has `1234 5678 9101 1234`, so you need to match the space between the numbers as well. – anubhava Nov 21 '19 at 18:14
So even `sed 's/\(\b[0-9]\{4\}\b\s\)\{3\}\([0-9]\{4\}\)/**** **** **** \2/'` will work for you, but it is slightly inefficient because the `\b` assertions are redundant. – anubhava Nov 21 '19 at 18:15
Thanks for the answer. The problem is that `\b` matches only a position, not a character. Your last comment helped me understand that. Thanks – abhiarora Nov 21 '19 at 18:22
[Here is good read on word boundaries](https://www.regular-expressions.info/wordboundaries.html) – anubhava Nov 21 '19 at 18:26
Thanks. It cleared up all of my questions. You can add that link to your answer as well. – abhiarora Nov 21 '19 at 18:30
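The point made in the comments, that `\b` matches a position rather than a character, can be seen directly with a small sketch. It assumes GNU sed; the variable names `boundaries` and `masked` are only illustrative.

```shell
# Assumes GNU sed: \b and \s are GNU extensions, not POSIX.

# Mark every word boundary with "|". The spaces stay where they are,
# because \b matches positions and consumes no characters.
boundaries=$(printf '%s\n' '1234 5678' | sed 's/\b/|/g')

# \b followed by \s: the boundary is asserted first, then \s consumes
# the space, so the repeated group can advance to the next number.
masked=$(printf '%s\n' '1234 5678 9101 1234' |
  sed 's/\(\b[0-9]\{4\}\b\s\)\{3\}\([0-9]\{4\}\)/**** **** **** \2/')

printf '%s\n%s\n' "$boundaries" "$masked"
```

The first line printed should be |1234| |5678|, with a marker on each side of every group and the space untouched. The second should be **** **** **** 1234.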
