I have a file containing one number per line.
data.txt
00700
00100
07070
01010
07700
07770
70000
10000
00007
I want to print lines that contain the digit 7 and at the same time do not contain the pair 77. I wrote a simple script for this.
script.sh
#!/usr/bin/env bash
cat data.txt | grep -E '[0-9]*7' | grep -v -E '[0-9]*77'
Quick explanation of the regular expression above:
- First take any digit zero or more times.
- Then take 7.
- If the line satisfies the rules above run the search again.
- Take any digit zero or more times.
- Take double 7.
- If the line again satisfies the rules, remove it from the selection (the -v option inverts the selection).
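The two stages described above can also be run one at a time to see what each one contributes. This is only a minimal sketch: it assumes the sample data is recreated inline with printf instead of being read from data.txt, it uses the bracket expression [0-9] for "any digit", and the variable names are purely illustrative.

```shell
#!/usr/bin/env bash
# Sample data from the question, recreated inline (assumption: stands in for data.txt).
data=$(printf '%s\n' 00700 00100 07070 01010 07700 07770 70000 10000 00007)

# Stage 1: keep every line that contains a 7.
stage1=$(printf '%s\n' "$data" | grep -E '[0-9]*7')

# Stage 2: from those lines, drop every line that contains 77 (-v inverts the match).
stage2=$(printf '%s\n' "$stage1" | grep -v -E '[0-9]*77')

echo "after stage 1:"
printf '%s\n' "$stage1"
echo "after stage 2:"
printf '%s\n' "$stage2"
```

Stage 1 still contains 07700 and 07770; only the second grep removes them.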
This works fine and outputs the desired result.
output
00700
07070
70000
00007
However, I had to start the grep program twice. I then tried a different regular expression.
script.sh
#!/usr/bin/env bash
cat data.txt | grep -E '[0-9]*7[0-689]?[0-9]*'
In my understanding, this should:
- Take any digit zero or more times.
- Then take 7.
- Then take any digit except 7 zero or one time.
- Then take any digit until the end of the line.
However, it also selects the lines that contain 77.
output
00700
07070
07700
07770
70000
00007
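One way to inspect what the pattern actually matches on a line with 77 is to print only the matched part with grep -o. This is a minimal sketch, assuming a grep that supports -o (GNU and BSD grep do, but it is not required by POSIX), using [0-9] for "any digit", and recreating the sample data inline with printf; the variable names are illustrative.

```shell
#!/usr/bin/env bash
pattern='[0-9]*7[0-689]?[0-9]*'

# -o prints only the part of the line that the pattern matched.
matched=$(printf '%s\n' 07700 | grep -o -E "$pattern")
printf 'matched part of 07700: %s\n' "$matched"

# Compare the pattern against a plain search for 7 over the sample data.
data=$(printf '%s\n' 00700 00100 07070 01010 07700 07770 70000 10000 00007)
with_pattern=$(printf '%s\n' "$data" | grep -E "$pattern")
with_plain_7=$(printf '%s\n' "$data" | grep 7)

if [ "$with_pattern" = "$with_plain_7" ]; then
    echo "the pattern selects the same lines as a plain search for 7"
fi
```

The optional [0-689]? is allowed to match nothing, and the trailing [0-9]* accepts any digit, including 7, so the second 7 is simply consumed there. Because grep also accepts a match anywhere in the line, the pattern ends up selecting every line that contains a 7.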
Is there a better way that starts grep, or any other program that uses regular expressions, only once?
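For reference, here are two possible single-process variants, offered as a sketch rather than a definitive answer. The first uses awk to combine both conditions in one program. The second uses one grep with a pattern anchored to the whole line, and assumes every line consists of digits only. The sample data is recreated inline with printf, and the variable names are illustrative.

```shell
#!/usr/bin/env bash
# Sample data from the question, recreated inline (assumption: stands in for data.txt).
data=$(printf '%s\n' 00700 00100 07070 01010 07700 07770 70000 10000 00007)

# Variant 1: a single awk process. Keep lines that match 7 and do not match 77.
awk_result=$(printf '%s\n' "$data" | awk '/7/ && !/77/')

# Variant 2: a single grep process. The pattern is anchored with ^ and $ and
# describes the whole line: non-7 digits, one 7, then any number of groups of
# "one or more non-7 digits followed by a 7", then trailing non-7 digits.
# Two adjacent 7s can never fit this shape.
grep_result=$(printf '%s\n' "$data" | grep -E '^[0-689]*7([0-689]+7)*[0-689]*$')

echo "awk:"
printf '%s\n' "$awk_result"
echo "grep:"
printf '%s\n' "$grep_result"
```

The anchors matter in the grep variant: without ^ and $ the pattern could match a short piece of the line and ignore a 77 elsewhere in it.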