
Suppose there is a text file test.txt. It contains text and links to resources such as https://example.com/kqodbjcuic49w95rofwjue. How can I extract just the list of these links from the file? (Preferably with bash, but that is not required.)

I tried this solution:

sed 's/^.*href="\([^"]*\).*$/\1/'

But it didn't work for me.

Marcin Orlowski

1 Answer

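grep -oP '((?:(?:http|ftp|ws)s?|sftp):\/\/?)?([^:\/\s.#?]+\.[^:\/\s#?]+|localhost)(:\d+)?((?:\/\w+)*\/)?([\w\-.]+[^#?\s]+)?([^#]+)?(#[\w-]*)?' test.txt

prints every match of the pattern in the file, one per line. The -P option (GNU grep) is needed because the pattern uses Perl-style syntax such as (?:...), \d and \s, and the JavaScript-style /.../gm delimiters have to be dropped, since grep would treat them as literal characters. Note that the pattern was written to validate a single URL rather than to find URLs in running text: the scheme is optional and ([^#]+)? also matches spaces, so a match can begin at a word such as test.txt and run on to the end of the line.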

(The regex comes from BSimjoo's link)
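
If the links in the file all start with http:// or https://, a narrower pattern may be enough. The following is only a minimal sketch and is not taken from the linked regex: extract_urls is a hypothetical helper name, and both the characters that end a URL (whitespace, double quote, angle brackets) and the trailing punctuation that gets stripped are assumptions about how the links appear in test.txt.

```shell
# extract_urls is a hypothetical helper name, not part of any existing tool.
# Prints each http:// or https:// URL found in the file given as $1, one per line.
extract_urls() {
    # -o prints only the matched text; -E selects extended regular expressions.
    # Assumption: a URL ends at the first whitespace, double quote, or angle bracket.
    # sed then strips trailing sentence punctuation, such as the "." after a link.
    grep -oE 'https?://[^[:space:]"<>]+' "$1" | sed -E 's/[.,;:!?)]+$//'
}
```

For the file in the question this would be called as extract_urls test.txt, and piping the result through sort -u would drop duplicate links. A URL that legitimately ends in one of the stripped punctuation characters would be shortened, so the sed step may need adjusting.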

For more on grep, see the guide to grepping text files at https://www.linode.com/docs/guides/how-to-grep-for-text-in-files/

Gaël James