
Wildcard search with grep

I have a file that contains many IP addresses. I wanted to list all the IP addresses in the file, so I used grep with the pattern 192.16*, but it doesn't show the complete IP addresses. I can only list the complete IP addresses when I use a period followed by an asterisk. So my question is: why does the 2nd option not work while the 4th option works fine?

root@test:~/test# cat z
192.168.1.0
192.168.2.0
192.168.110.7
192.168.115.5

1. root@test:~/test# grep -o 192.1 z
192.1
192.1
192.1
192.1


2. root@test:~/test# grep -o 192.1* z
192.1
192.1
192.1
192.1

3. root@test:~/test# grep -o 192.1. z
192.16
192.16
192.16
192.16


4. root@test:~/test# grep -o 192.1.* z
192.168.1.0
192.168.2.0
192.168.110.7
192.168.115.5
theG

2 Answers

  • A dot (.) matches any character; to match a literal dot you have to escape it: \.

  • -o shows only the matching part. If you omit .* (= any characters) from the end, it will omit the rest of the line (as it's not part of the matched string).

  • .* can match a lot more than you need (it will match the rest of the line), prefer to say explicitly what you allow: [0-9.]*.

  • Make sure you put the search expression in single quotes: '192\.168\.[0-9.]*'. Otherwise the shell will interpret the special characters and replace the expression with any matching filenames (luckily you didn't have any matching filenames).

  • You might want to match whole words only (-w). If you want to make sure you only match IP addresses and not something that merely resembles one (no consecutive dots, exactly 4 numbers, each <= 255...), then you'll need a much more complex expression.
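
To make the quoting and escaping points concrete, here is a minimal sketch. It assumes a grep that supports -o (GNU grep does) and works in a scratch directory; the extra line 192x168y1 and the file name 192.1-notes are made up for the demonstration and are not part of the question.

```shell
# Work in a scratch directory so the glob demonstration cannot touch real files.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

# The four addresses from the question plus one made-up line without real dots.
printf '%s\n' 192.168.1.0 192.168.2.0 192.168.110.7 192.168.115.5 192x168y1 > z

# Quoted pattern, escaped dots, and an explicit character class instead of .*
matches=$(grep -o '192\.168\.[0-9.]*' z)
printf '%s\n' "$matches"

# An unquoted pattern is subject to filename expansion: once a file whose
# name matches the glob exists, the shell hands that name to the command.
touch 192.1-notes
expanded=$(echo 192.1*)
printf 'the shell turned 192.1* into: %s\n' "$expanded"

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```

The first command should print only the four real addresses, because the escaped dots no longer accept the x and y in the made-up line. The second part shows why the quotes matter: with such a file present, an unquoted 192.1* never reaches grep as a pattern at all.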

Karoly Horvath

Why your commands are (not) working:

1. root@test:~/test# grep -o 192.1 z

Only 192<any char>1 will be matched, and only the matching part will be printed because of the -o switch.

2. root@test:~/test# grep -o 192.1* z

Only 192<any char>, 192<any char>1, 192<any char>11, 192<any char>111 etc. will be matched, and only the matching part will be printed because of the -o switch. Your input does not contain data where this makes any difference.

3. root@test:~/test# grep -o 192.1. z

Only 192<any char>1<any char> will be matched, and only the matching part will be printed because of the -o switch. This gives you one more character (. stands for "any" single character).

4. root@test:~/test# grep -o 192.1.* z

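Any line containing 192<any char>1 will be matched (grep does not anchor the pattern to the start of the line), and only the matching part will be printed because of the -o switch. .* matches anything up to the end of the line, including the empty string, so the printed match runs from 192 to the end of the line.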
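
Since the sample file hides the effect of 1*, a small sketch with different input may help. The lines 192.111.5.5 and 192-7 are hypothetical, chosen only so that the quantifier has something to repeat over (and a case where it matches zero times); -o is assumed to be available, as in GNU grep.

```shell
# Hypothetical input: one line from the question, one with repeated 1s,
# and one where zero "1" characters follow the "any character" position.
star_demo=$(printf '%s\n' 192.168.1.0 192.111.5.5 192-7 | grep -o '192.1*')
printf '%s\n' "$star_demo"
```

This should print 192.1, 192.111 and 192- in that order: the * repeats only the 1 before it, it never stands for "anything" the way it does in shell filename patterns.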

Regular expression for IP addresses

You can find lots of IP address regular expressions on the web, see for example this StackOverflow question. Note however that some of the expressions are used to match only the IP address and therefore contain beginning- (^) and end-of-line ($) characters. You will have to remove those if your input contains more than just the addresses.
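
As one possible illustration (not taken from the linked question), here is a sketch of an unanchored expression that limits each number to 0-255. It assumes a grep with -E, -o and -w, such as GNU grep; the variable names octet and ip_re and the three input lines are made up for the example.

```shell
# One number in the range 0-255, written out as alternatives.
octet='(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])'

# Four such numbers separated by literal dots; no ^ or $ anchors,
# so addresses embedded in longer lines can still be found.
ip_re="($octet\.){3}$octet"

# -w rejects matches that are glued to a neighbouring digit or letter,
# which keeps 256.1.1.1 from being reported as 56.1.1.1.
found=$(printf '%s\n' 'host 192.168.1.0 up' 'bad 256.1.1.1' 'gw=10.0.0.254;' | grep -Eow "$ip_re")
printf '%s\n' "$found"
```

With this input the output should be 192.168.1.0 and 10.0.0.254. The sketch is still not a full validator: a dot is not a word character, so a string such as 1.2.3.4.5 would still produce a match.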

Michael Jaros
  • Thank you for the detailed explanation, but I am still confused by the second one. You said: `"Only 1921, 19211, 192111 etc. will be matched"`, but why can't it be: `"Only 1921, 19216, 192168 etc."`? As far as I know, `*` means anything, which would include all the IP addresses in the 192.168 range too. – theG May 04 '15 at 16:13
  • In (both ["basic" and "extended"](http://www.gnu.org/software/grep/manual/html_node/Basic-vs-Extended.html)) regular expressions, `*` is a quantifier. That means it denotes _how often_ the preceding character must be present in the input to match the pattern. `*` stands for `zero or more times`. So in example 2, `1*` means `zero or more "1" characters`. However, I've just noticed that my answer was not exact about the "zero" part, so I've edited that. See manpage `regex(7)` for more info about the regular expressions used by `grep`. – Michael Jaros May 04 '15 at 17:25
  • Thank you for elaborating. It is clear for me now :) – theG May 07 '15 at 05:09