$ lynx --dump -listonly index.html

Example result:

References

Visible links
1. http://lynx.invisible-island.net/
2. http://lynx.invisible-island.net/lynx.html
3. http://lynx.invisible-island.net/current/index.html

I want to remove the leading "1.", "2.", and "3." numbering, along with the "References" and "Visible links" header lines.

Wanted Result:

http://lynx.invisible-island.net/
http://lynx.invisible-island.net/lynx.html
http://lynx.invisible-island.net/current/index.html
  • What did you try for yourself? Post them even if they are trivial – Inian Apr 06 '18 at 20:12
  • You can pipe it to `sed` to remove the initial digits. – Barmar Apr 06 '18 at 20:13
  • sed, awk, and grep all confuse me, I've already solved this problem, but it involves pasting the index.html to https://www.browserling.com/tools/extract-urls . I'm gonna have to do this in the future, and would like to put this into a nice single script. – Caucasian Malaysian Apr 06 '18 at 20:17
  • @CaucasianMalaysian I think you mean that regular expressions confuse you. – Tripp Kinetics Apr 06 '18 at 20:24
  • @CaucasianMalaysian: Suggest following a good tutorial to learn how to use them if you are planning to work with them. For now you could pipe the output to `sed` as `lynx .. | sed -n 's/^[[:digit:]]\.[[:space:]]\(.*\)$/\1/p'` or for numbers greater than one digit do `sed -n 's/^\([[:digit:]]*\)\.[[:space:]]\(.*\)$/\2/p'` – Inian Apr 06 '18 at 20:24

3 Answers


You can use the -nonumbers option of Lynx:

lynx --dump -nonumbers -listonly http://lynx.invisible-island.net/
poboxy

Try:

lynx --dump -listonly index.html | sed -E 's/^ *[0-9]+\. //'
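
The substitution above strips the numbering but leaves the "References" and "Visible links" header lines in the output. A minimal sketch that drops them as well, assuming every link line starts with optional whitespace, a number, a dot, and whitespace. The function name `strip_lynx_numbering` is hypothetical, not something from Lynx or this thread:

```shell
# Hypothetical helper: reads `lynx --dump -listonly` output on stdin
# and prints only the URLs.
strip_lynx_numbering() {
    # -n turns off sed's default printing; the trailing `p` prints a line
    # only when the substitution succeeded, so header and blank lines
    # (which have no leading number) are dropped.
    sed -nE 's/^[[:space:]]*[0-9]+\.[[:space:]]+//p'
}

# Intended usage (requires lynx to be installed):
#   lynx --dump -listonly index.html | strip_lynx_numbering
```

Because it filters by pattern rather than by line position, this should keep working if the number of header lines changes.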
Tripp Kinetics

I get this input, with a leading space at the start of each line:

 1. http://lynx.invisible-island.net/
 2. http://lynx.invisible-island.net/lynx.html

Then, deleting lines 1 to 3 (the header lines) and stripping the optional leading spaces and the numbering:

lynx --dump -listonly http://lynx.invisible-island.net/ | sed -E '1,3d; s/^ *[0-9]+\. //'

output

http://lynx.invisible-island.net/
http://lynx.invisible-island.net/lynx.html
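
Deleting lines 1 to 3 relies on the header always occupying exactly three lines. As an alternative sketch that does not depend on line positions, `grep -o` can print only the URL-looking part of each line. This is a different technique from the `sed` command above, and it assumes the links of interest use `http` or `https` (a `file://` link would be skipped). The function name `extract_urls` is hypothetical:

```shell
# Hypothetical helper: reads any text on stdin and prints each
# http/https URL on its own line.
extract_urls() {
    # -E enables extended regular expressions; -o prints only the
    # matched text, so numbering, indentation, and header lines vanish.
    grep -Eo 'https?://[^[:space:]]+'
}

# Intended usage (requires lynx to be installed):
#   lynx --dump -listonly http://lynx.invisible-island.net/ | extract_urls
```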
kyodev