I'm new to bash scripting, and the previous answers didn't help me.
I am trying to harvest ids from web pages: I need to parse page1, get a list of ids, and use them to parse the corresponding web pages.
The thing is I'm not sure how to write the script...
Here's what I would like to do:
- Parse url1 according to a regexp. Output: a list of extracted ids (101, 102, 103, etc).
- Parse the URL for each extracted id, for example: parse (http://someurl/101), then parse (http://someurl/102), etc.
So far, I have come up with this command:
curl http://subtitle.co.il/browsesubtitles.php?cs=movies | grep -o -P '(?<=list.php\?mid=)\d+'
The command above works, and gives a list of ids.
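For the second step, this is roughly what I have in mind. It is only a sketch and I have not run it against the real site. The function names (extract_ids, harvest), the page_<id>.html output file names, and the sort -u de-duplication are my own assumptions, and the per-id base URL is passed in as a parameter because I don't know yet what it should be:

```shell
#!/usr/bin/env bash

# Step 1: read HTML on stdin and print one id per line.
# Same grep as in my command above (needs GNU grep for -P);
# sort -u is an assumption, in case the same id is linked more than once.
extract_ids() {
    grep -oP '(?<=list\.php\?mid=)\d+' | sort -u
}

# Step 2: fetch the index page, then fetch the page for every extracted id.
#   $1 = URL of the index page
#   $2 = base URL that an id gets appended to (placeholder, e.g. http://someurl/)
harvest() {
    local index_url=$1 base=$2 id
    curl -s "$index_url" | extract_ids |
    while IFS= read -r id; do
        # page_<id>.html is a hypothetical output name for the downloaded page
        curl -s "${base}${id}" -o "page_${id}.html"
    done
}
```

I would then call it as harvest 'http://subtitle.co.il/browsesubtitles.php?cs=movies' 'http://someurl/', with the URLs quoted so the shell leaves the ? alone.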
Any advice for the next steps? Am I on the right track?
Thanks!