37

I already understand that .* means zero or more of any character, but could someone explain how .* works in the following patterns and what each one would match?

.*([a-m/]*).*

.*([a-m/]+).*

.*?([a-m/]*).*
Purag
OpMt
  • 4
    In regex, `.` refers to any character, be it a digit, a letter, or any other special character. `*` means zero or more times. – asgs Oct 01 '12 at 02:08
  • 3
    It's simple enough - any symbol, present zero or more times - but there's a *ton* of nuances under that. What's more, it's an extremely central concept in regular expressions. Go out right now and read a backgrounder on regular expressions. You'll get further, faster, that way. – Michael Petrotta Oct 01 '12 at 02:09

4 Answers

28

The dot means any character can go here, and the star means zero or more times, so .* accepts any sequence of characters, including an empty string.
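A quick way to check this claim is to try it. The sketch below uses Python's `re` module; the sample strings are made up for illustration. It also shows one caveat that is specific to the engine and flags in use: in Python, `.` does not match a newline unless `re.DOTALL` is set.

```python
import re

# `.*` matches any run of characters, including the empty string.
samples = ["", "abc", "hello/world 123"]  # illustrative inputs
results = [re.fullmatch(r".*", s) is not None for s in samples]
print(results)  # [True, True, True]

# Caveat: by default `.` does not match a newline in Python,
# so `.*` alone cannot span two lines without re.DOTALL.
multiline = "line1\nline2"
without_dotall = re.fullmatch(r".*", multiline) is not None
with_dotall = re.fullmatch(r".*", multiline, flags=re.DOTALL) is not None
print(without_dotall)  # False
print(with_dotall)     # True
```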

Ionut Hulub
16

Each case is different:

.*([a-m\/]*).*

The first .* will match everything it can, normally the whole string, because [a-m/] is not required to be present, and the first * is greedy and comes first. The capture group is left with an empty match.

.*([a-m\/]+).*

The first .* will match everything up to the last character that matches [a-m/], since only one such character is required, and the first * is greedy and comes first. The group captures just that one character.

.*?([a-m\/]*).*

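The first .*? is not greedy, so it matches as little as possible. Because [a-m/]* is allowed to match nothing, .*? does not need to consume anything and matches the empty string. Then [a-m/]* matches all it can at the start of the string, because * is greedy (possibly nothing, if the string does not start with one of those characters), and the last .* matches the rest of the string. The first .*? would run up to the FIRST character that matches [a-m/] only if the class were followed by + instead of *.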
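To see what each pattern actually captures, here is a minimal sketch in Python's `re` module. The sample string and the helper name `captured` are assumptions for illustration only. The fourth pattern, a lazy `.*?` followed by `[a-m/]+`, is not from the question and is included just for comparison.

```python
import re

text = "xyz/abc123"  # illustrative sample string, not from the question

def captured(pattern, s):
    """Return the text captured by group 1 and the index where it starts."""
    m = re.match(pattern, s)
    return m.group(1), m.start(1)

greedy_star = captured(r".*([a-m/]*).*", text)
greedy_plus = captured(r".*([a-m/]+).*", text)
lazy_star = captured(r".*?([a-m/]*).*", text)
lazy_plus = captured(r".*?([a-m/]+).*", text)  # comparison case only

print(greedy_star)  # ('', 10)     first .* took the whole string
print(greedy_plus)  # ('c', 6)     first .* backed off by the minimum needed
print(lazy_star)    # ('', 0)      both .*? and the group matched nothing
print(lazy_plus)    # ('/abc', 3)  .*? grew until the class could match
```

With this input, the lazy `.*?` followed by `[a-m/]*` captures nothing, because both parts can succeed with an empty match at position 0. Only when the class is followed by `+` does the lazy quantifier stop just before the first matching character.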

LMB
  • 1
    There is an error in the third case. As you wrote yourself, `[a-m/]` is followed by a `*`, not by a `+` as you assumed afterwards. So, in your own words from the first case, "`[a-m/]` is not required to be present". Which one takes priority, then? – Atralb May 21 '20 at 01:07
  • In the third example, the last `.*` is not greedy because there is no option to be `greedy`, right? – Timo Mar 24 '21 at 18:10
13

The function of .* in your examples is to make sure that the contained expression can be surrounded by anything (or nothing). The dot represents an arbitrary character, and the asterisk says that the preceding element can be repeated an arbitrary number of times (or not at all).
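The effect of the surrounding `.*` is easiest to see when the pattern has to cover the whole string. The following sketch uses Python's `re.fullmatch` for that; the input strings are invented for illustration.

```python
import re

s = "123abc456"  # illustrative input

# Without the surrounding .*, the class must account for the entire string.
bare = re.fullmatch(r"[a-m/]+", s)

# With .* on both sides, anything (or nothing) may surround the core part.
wrapped = re.fullmatch(r".*[a-m/]+.*", s)

# The core part is still required: no character from the class, no match.
no_core = re.fullmatch(r".*[a-m/]+.*", "123456")

print(bare is None)         # True
print(wrapped is not None)  # True
print(no_core is None)      # True
```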

Jimmy C
11

.* means "any character, any number of repetitions."

XIVSolutions