Yesterday my roommate and I discussed a question on Stack Overflow:
How to get the second column from command output?
It asks how to extract the second column from input like this:
1540 "A B"
6 "C"
119 "D"
The top-voted answer,
<some_command> | sed 's/^.* \(".*"$\)/\1/'
produces exactly the requested output.
But then we noticed that, if the regex were purely greedy, the pattern ^.*␣
would match 1540 "A␣
which confused my roommate. With the benefit of hindsight, I think the pattern ^.*␣
has to give back part of its match so that the pattern (".*"$)
can still match; otherwise the second pattern would match nothing. However, my roommate was not convinced by my hypothesis, so he came up with further examples to test it, and we tried them.
We ran two experiments. In the first, we added a quote "
after the character A, like this:
1540 "A" B"
6 "C"
119 "D"
and, as expected, the same sed command gives this result:
"A" B"
"C"
"D"
In the second, we added a space and a quote ␣"
after the A, like this:
1540 "A " B"
6 "C"
119 "D"
the result is:
" B"
"C"
"D"
At this point my roommate was even more confused, because he was focusing only on the second pattern (".*"$)
. In his mind, the pattern (".*"$)
should behave the same way on the two strings 1540 "A" B"
and 1540 "A " B"
, so the result of the second test should be "A " B"
rather than " B"
. I think that in the first experiment the pattern ^.*␣
clearly cannot match 1540 "A"␣
, because that would leave no match for the second pattern. But in the second experiment, 1540 "A " B"
, the two candidates 1540␣
and 1540 "A␣
both seem reasonable for the first pattern: the former would result from the greed of (".*"$)
, the latter from the greed of ^.*␣
.
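To make the two candidate splits concrete, here is a minimal sketch that reruns our tests and also captures what the leading .* matched. The helper names extract_quoted and show_split are hypothetical, and the second sed expression is my own variant of the answer's command with an extra capture group; I am assuming GNU sed here.

```shell
# Hypothetical helper: applies the sed command from the answer to one line.
extract_quoted() {
  printf '%s\n' "$1" | sed 's/^.* \(".*"$\)/\1/'
}

# Hypothetical helper: same regex, but the leading .* is captured as \1
# so that both halves of the split are printed in square brackets.
show_split() {
  printf '%s\n' "$1" | sed 's/^\(.*\) \(".*"$\)/[\1] [\2]/'
}

for line in '1540 "A B"' '1540 "A" B"' '1540 "A " B"'; do
  extract_quoted "$line"
  show_split "$line"
done
```

For the second experiment, show_split prints [1540 "A] [" B"], so the split actually chosen is the one where the first pattern takes as much as it can while still leaving something the second pattern can match.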
So can anyone explain more precisely which rule decides the match here, and where our confusion comes from? Thanks.