What you probably forgot is the -E option for sed.
From sed --help:
-E, -r, --regexp-extended
use extended regular expressions in the script
(for portability use POSIX -E).
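As a rough sketch of what -E changes: in basic regular expressions (the default), grouping parentheses have to be escaped, while with -E they are written bare. The "abc" input and the variable names below are only illustrative.

```shell
# Illustrative input; both commands capture the "b" between "a" and "c".
input="abc"

# Extended syntax (-E): bare parentheses form a capture group.
ere=$(printf '%s\n' "$input" | sed -E 's/a(b)c/\1/')

# Basic syntax (no -E): the parentheses must be escaped to form a group.
bre=$(printf '%s\n' "$input" | sed 's/a\(b\)c/\1/')

echo "$ere"
echo "$bre"
```

Both commands should print b, so the choice between them is mostly about readability.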
You don't have to change your regex significantly, but you do need to add .* on both sides of it, so that the match covers the whole line and the substitution removes the rest of the string.
This works fine for me:
echo "first url, second url, third url" | sed -E 's/.*second (url).*/\1/'
Output:
url
Here the output "url" is the second instance in the string. If you already know that the entries are separated by a comma and a space, and commas never appear inside the URLs, then the regex [^,]* should be enough to match the URL itself.
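A minimal sketch of that idea, reusing the sample string from above (the variable names are my own): the capture group takes everything after "second " up to the next comma.

```shell
# Same sample string as in the earlier command.
input="first url, second url, third url"

# ([^,]*) captures every character after "second " that is not a comma.
result=$(printf '%s\n' "$input" | sed -E 's/.*second ([^,]*).*/\1/')

echo "$result"
```

This should print url, without the literal text having to be hard-coded in the pattern.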
For example, with longer URLs:
echo "first http://test.url/1, second ://test.url/with spaces/2, third ftp://test.url/3" \
| sed -E 's/.*second ([a-zA-Z]*:\/\/[^,]*).*/\1/'
Which outputs:
://test.url/with spaces/2
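One thing to keep in mind: when the pattern does not match, sed prints the line unchanged. If you would rather get nothing in that case, a possible variation is to combine -n with the p flag of the s command, so that only successfully substituted lines are printed. The "fourth" label below is a hypothetical non-matching case, and the variable names are my own.

```shell
# Same input as in the command above.
input="first http://test.url/1, second ://test.url/with spaces/2, third ftp://test.url/3"

# -n suppresses the default output; the trailing p prints only when the substitution succeeds.
matched=$(printf '%s\n' "$input" | sed -nE 's/.*second ([a-zA-Z]*:\/\/[^,]*).*/\1/p')

# "fourth" does not occur in the input, so nothing is printed.
unmatched=$(printf '%s\n' "$input" | sed -nE 's/.*fourth ([a-zA-Z]*:\/\/[^,]*).*/\1/p')

echo "matched: $matched"
echo "unmatched: $unmatched"
```

That makes it easier to tell "no match" apart from a real result in a script.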