
In brief:

Is there a way to use pup to limit the number of results per parent tag, rather than overall?

Backstory/use-case:

Ever since I learned about pup I've been obsessed. I'm constantly thinking of new use cases. This morning I wanted to use it to grab the latest headlines from ESPN.

ESPN seems to have an unordered list like this: <ul class="headlines"> and then a bunch of list items.

A simple solution would be:

$ curl -s -S http://espn.go.com/ | pup .headlines a text{}

right? But some list items contain several links: one for the headline itself, plus extra links to alternate authors. So you end up with results like "Low", "Anande", "Stark", and "Dinich" (last names of ESPN authors) mixed in with the headlines.

Ideally I'd like to do something like this:

$ curl -s -S http://espn.go.com/ | pup .headlines li a slice{:1} text{}

but that only returns the first result overall, not the first one per list item. :\

There are multiple <a> tags per <li>, so I'd like to retrieve all of the <li> items, but limit the number of <a> tags to 1 per <li>. Is this possible?

kenorb

1 Answer

$ curl -s -S http://espn.go.com/ | pup '.headlines li a:first-of-type text{}'
eric chiang
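A note on why the answer's selector fits the question: in CSS, :first-of-type matches an element that is the first of its tag name among its siblings, so it is evaluated once per parent rather than once for the whole result set. The sketch below tries the same selector against a small local sample instead of the live site. The markup, the hrefs, and the helper names sample_html and first_links are made up for illustration; only the selector comes from the answer. It assumes pup is installed and on your PATH.

```shell
#!/bin/sh
# Hypothetical sample markup standing in for ESPN's headline list.
# Each <li> has a headline link followed by an author link.
sample_html() {
  cat <<'EOF'
<ul class="headlines">
  <li><a href="/story/1">Headline one</a> <a href="/author/low">Low</a></li>
  <li><a href="/story/2">Headline two</a> <a href="/author/stark">Stark</a></li>
</ul>
EOF
}

# Keep only the first <a> among its siblings in each <li>, then print its text.
# The selector is quoted so the shell passes it to pup as a single argument.
first_links() {
  pup '.headlines li a:first-of-type text{}'
}

sample_html | first_links
```

With this sample, the author links ("Low", "Stark") should be filtered out, leaving one line per list item. One caveat: because :first-of-type compares siblings only, an <li> whose links are wrapped in separate child elements could still yield more than one match.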