links = re.findall(r'\w+://\w+.\w+.\w+\w+\w.+"', page)
to parse the links from a webpage.
Any help would be appreciated. This is what I get from parsing http://www.soc.napier.ac.uk/~cs342/CSN08115/cw_webpage/index.html:
#my current output#
http://net.tutsplus.com/tutorials/other/8-regular-expressions-you-should-know/"
http://www.asecuritysite.com/content/icon_clown.gif" alt="if broken see alex@school.ac.uk +44(0)1314552759" height="100"
http://www.rottentomatoes.com/m/sleeper/"
http://www.rottentomatoes.com/m/sleeper/trailer/"
http://www.rottentomatoes.com/m/star_wars/"
http://www.rottentomatoes.com/m/star_wars/trailer/"
http://www.rottentomatoes.com/m/wargames/"
http://www.rottentomatoes.com/m/wargames/trailer/"
https://www.sans.org/press/sans-institute-and-crowdstrike-partner-to-offer-hacking-exposed-live-webinar-series.php"> SANS to Offer "Hacking Exposed Live"
https://www.sans.org/webcasts/archive/2013"
#I want to get this when I run the module#
http://net.tutsplus.com/tutorials/other/8-regular-expressions-you-should-know/
http://www.asecuritysite.com/content/icon_clown.gif
http://www.rottentomatoes.com/m/sleeper/
http://www.rottentomatoes.com/m/sleeper/trailer/
http://www.rottentomatoes.com/m/star_wars/
http://www.rottentomatoes.com/m/star_wars/trailer/
http://www.rottentomatoes.com/m/wargames/
http://www.rottentomatoes.com/m/wargames/trailer/
https://www.sans.org/press/sans-institute-and-crowdstrike-partner-to-offer-hacking-exposed-live-webinar-series.php
https://www.sans.org/webcasts/archive/2013
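
For reference, the gap between the two listings seems to come from the trailing `.+"` in the pattern: `.+` is greedy, so it runs on to the last double quote on the line rather than stopping at the one that closes the URL. Below is a minimal sketch of one possible fix, which stops the match at the first quote, whitespace or angle bracket. The `page` string is a made-up sample standing in for the downloaded source, and `LINK_RE` and `extract_links` are hypothetical names, not part of my module.

```python
import re

# Hypothetical sample standing in for the downloaded page source.
page = '''
<a href="http://www.rottentomatoes.com/m/sleeper/">Sleeper</a>
<img src="http://www.asecuritysite.com/content/icon_clown.gif" alt="if broken see alex@school.ac.uk +44(0)1314552759" height="100">
<a href="https://www.sans.org/press/sans-institute-and-crowdstrike-partner-to-offer-hacking-exposed-live-webinar-series.php"> SANS to Offer "Hacking Exposed Live"</a>
<a href="https://www.sans.org/webcasts/archive/2013">Archive</a>
'''

# Scheme, "://", then every character up to (not including) the first
# quote, whitespace or angle bracket. A negated character class cannot
# run past the closing quote the way a greedy ".+" does.
LINK_RE = re.compile(r'\w+://[^\s"\'<>]+')


def extract_links(html):
    """Return every URL-like substring found in html, in document order."""
    return LINK_RE.findall(html)


links = extract_links(page)
for link in links:
    print(link)
```

This only looks for text shaped like `scheme://...`, so it would also pick up URLs that appear outside `href` or `src` attributes. If that turns out to matter, an HTML parser is probably a safer tool than a regular expression.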