I am using a script to check the links on a given page. I am using Simple HTML DOM to parse the information into an array. I have to check the href of every a tag to determine whether it points to a file or contains something like # or JavaScript.
I tried the following without success:
if (preg_match("|^(.*)|iU", $href)) {
    save_link();
}
I don't know if my pattern is wrong or if there is a better way to do this.
I want to be able to detect whether $href contains something like .com or .php, or any other file extension. That would filter out values such as # and "function()" and other non-link items used in the href attribute.
EDIT: parse_url will not work, so please stop suggesting it. It treats the value # as a valid URL. As I stated above, I am trying to match any string followed by a dot and an extension, with no more than 4 characters after the dot.
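To make the rule concrete, here is a rough sketch of the kind of check I have in mind. The function name looks_like_file_link is made up for illustration, and stripping the query string and rejecting javascript: values are my own assumptions about what should happen, so treat it as a sketch rather than a finished solution.

```php
<?php
// Hypothetical helper: true when $href appears to end in a dot followed by
// 1 to 4 letters/digits (e.g. .com, .php, .html), optionally followed by a path.
function looks_like_file_link(string $href): bool
{
    $href = trim($href);

    // Assumption: anything starting with "javascript:" is script, not a link.
    if (stripos($href, 'javascript:') === 0) {
        return false;
    }

    // Drop the query string and fragment so "page.php?id=1" is judged as "page.php"
    // and a bare "#" becomes an empty string.
    $path = preg_split('/[?#]/', $href, 2)[0];

    // One character that is not a dot or slash, then a dot, then 1-4 alphanumerics,
    // then either the end of the string or a trailing path such as "/page".
    return preg_match('/[^.\/]\.[a-z0-9]{1,4}(?:\/.*)?$/i', $path) === 1;
}
```

With this, save_link() would only be called when looks_like_file_link($href) returns true. It would accept values such as page.php?id=1 or http://example.com/page and reject # and function(), but it would also reject extensions longer than 4 characters.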