I think regular expressions are a great way to extract pieces of a given text.
This regex captures the File, Filename and Ext (extension) of a link from an href attribute in the given text.
href="(?<File>(?<Filename>.*?)(?<Ext>\.\w{1,3}))"
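The regex above expects an extension that consists of a dot followed by 1 to 3 word characters (\w: letters, digits and the underscore). A longer extension such as .html will therefore not be captured as intended.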
C# Code sample:
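using System.Text.RegularExpressions;

// Named groups: File = whole path, Filename = part before the extension, Ext = extension including the dot.
string regex = "href=\"(?<File>(?<Filename>.*?)(?<Ext>\\.\\w{1,3}))\"";
RegexOptions options = RegexOptions.IgnorePatternWhitespace | RegexOptions.Multiline | RegexOptions.IgnoreCase;
Regex reg = new Regex(regex, options);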
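
Here is a minimal sketch of how the pattern might be used to read the three named groups from a match. The HrefParser class, the TryParse method and the sample HTML string are hypothetical names chosen for illustration, not part of the original sample. Only RegexOptions.IgnoreCase is passed here, because the pattern contains no whitespace and no ^ or $ anchors, so the other two options should make no difference for it.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class HrefParser
{
    // Same pattern as in the sample, written as a verbatim string
    // so the backslashes do not need doubling ("" is an escaped quote).
    private static readonly Regex HrefRegex = new Regex(
        @"href=""(?<File>(?<Filename>.*?)(?<Ext>\.\w{1,3}))""",
        RegexOptions.IgnoreCase);

    // Reads the named groups of the first href found in html.
    // Returns false (and null outputs) when nothing matches.
    public static bool TryParse(string html, out string file, out string filename, out string ext)
    {
        file = filename = ext = null;
        Match m = HrefRegex.Match(html);
        if (!m.Success)
        {
            return false;
        }
        file = m.Groups["File"].Value;
        filename = m.Groups["Filename"].Value;
        ext = m.Groups["Ext"].Value;
        return true;
    }

    public static void Main()
    {
        // Hypothetical input, not taken from the original post.
        string html = "<a href=\"docs/report.pdf\">Report</a>";

        string file, filename, ext;
        if (HrefParser.TryParse(html, out file, out filename, out ext))
        {
            Console.WriteLine(file);     // docs/report.pdf
            Console.WriteLine(filename); // docs/report
            Console.WriteLine(ext);      // .pdf
        }
    }
}
```

Because Filename uses the lazy .*? and the extension has to be followed directly by the closing quote, a name with several dots such as archive.tar.gz ends up with Filename archive.tar and Ext .gz.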