I'm using the regex.h library for my C program.
I need to download every file that is linked from an <a> tag in an HTML page, so my first task is to extract the value of each "href" attribute.
I am practicing on this page: http://students.iitk.ac.in/programmingclub/course/lectures/
Its HTML contains many tags like these:
<a href="1.%20Introduction%20to%20C%20language%20and%20Linux.pdf">
<a href="1.%20Introduction%20to%20C%20language%20and%20Linux.ppt">
<a href="1.%20Introduction%20to%20C%20language%20and%20Linux.pptx">
...
I wrote this regex to capture the value of the "href" attribute:
char regex[] = "href=\"([a-zA-Z0-9%.,]*\\.[a-zA-Z0-9]*{1,4})\"";
What I expect the regex to give me (I can handle the full match and the group match myself):
1.%20Introduction%20to%20C%20language%20and%20Linux.pdf
1.%20Introduction%20to%20C%20language%20and%20Linux.ppt
1.%20Introduction%20to%20C%20language%20and%20Linux.pptx
...
What I actually get is only the first link (I only care about the group match):
1.%20Introduction%20to%20C%20language%20and%20Linux.pdf
Have a nice day, and thank you very much.
P.S. I pass REG_EXTENDED to regcomp().
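
For reference, a single regexec() call reports only the first match in the string it is given, which would explain getting just the first link. Below is a minimal sketch of one way to collect every link: call regexec() in a loop and move the start pointer past each match. The helper name extract_hrefs, the 256-byte buffer size, and the out array are hypothetical, chosen only for illustration. The pattern is the one above with the '*' in front of {1,4} dropped, on the assumption that it was unintended (POSIX leaves stacked repetition operators undefined).

```c
#include <regex.h>
#include <string.h>

/* Hypothetical helper: copies up to max_links href values from html into out.
   Returns the number of links found, or -1 if the pattern fails to compile. */
int extract_hrefs(const char *html, char out[][256], int max_links)
{
    /* Same idea as the pattern in the question, minus the '*' before {1,4}. */
    const char *pattern = "href=\"([a-zA-Z0-9%.,]*\\.[a-zA-Z0-9]{1,4})\"";
    regex_t re;
    regmatch_t m[2];            /* m[0] = full match, m[1] = group 1 */
    const char *cursor = html;  /* where the next search starts */
    int count = 0;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    /* Each regexec() call finds the first match at or after cursor. */
    while (count < max_links && regexec(&re, cursor, 2, m, 0) == 0) {
        size_t len = (size_t)(m[1].rm_eo - m[1].rm_so);
        if (len >= 256)
            len = 255;          /* truncate so the value fits the buffer */
        memcpy(out[count], cursor + m[1].rm_so, len);
        out[count][len] = '\0';
        count++;
        /* Offsets are relative to cursor, so step past the full match. */
        cursor += m[0].rm_eo;
    }

    regfree(&re);
    return count;
}
```

With a buffer declared as char links[16][256], calling extract_hrefs(html, links, 16) should fill one entry per matching href and return how many were found. The values are still percent-encoded (%20 and so on), so they may need decoding or joining with the base URL before downloading.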