@MurrayW's answer is excellent, but I would like to add some background information.
Regex as Finite State Automata
When I first learned regular expressions in university, we learned to convert them to finite state automata, essentially compiling them into graphs that were then processed to match the string. When you do that, (cd|c)
and (c|cd)
get compiled into the same graph, in which case both of your regular expressions would match the whole string. This is what grep
actually does:
Both
echo abcdcdd | grep --color -E 'ab(c|cd)*d'
and
echo abcdcdd | grep --color -E 'ab(cd|c)*d'
color the whole string in red.
Patterns we call "regular expressions"
True finite state automata have many limitations that programmers don't like, such as the inability to capture matching groups, of to reuse those groups later in the pattern, and other limitations I forget, so the regular expression libraries that we use in most programming languages implement more complex formalisms. I don't remember that they are exactly, maybe push-down automata, but we have memory, we have backtracking, and all sorts of good stuff we use without thinking about it.
At the risk of seeming pedantic, the patterns we use are not "regular" at all. I know, the difference is usually not relevant, we just want our code to work, but once in a while it matters.
So, while the regular expressions (cd|c)
and (c|cd)
would be compiled into the same finite state automaton, those two (non-regular) patterns are instead turned into logic that says try the variants from left to right, and backtrack only if the rest of the pattern fails to match later, hence the results you observed.
Speed
While the patterns our "regular expression" libraries support offer us lots of goodies we like, those come at a performance cost. True regular expressions are blazingly fast, while our patterns, though usually fast, can sometimes be very expensive. Search for "catastrophic backtracking" on this site for many examples of patterns that take exponential time to fail. The same patterns, used with grep
, would be compiled into a graph that is applied in linear time to the string to match no matter what.