I'm working on building a Java program that will download a copy of a website to a local machine while maintaining the original file hierarchy.
I'm using the following patterns.
To find CSS links of the form http://www.w3schools.com/css/css_howto.asp (not working):
private static final String HTML_CSS_TAG_PATTERN = "\\s*(?i)href\\s*=\\s*(\"([^\"]*\")|'[^']*'|([^'\">\\s]+))";
private static final String CSS_TAG_PATTERN = "(?i)<link([^>]+)>(.+?)>";
To find images (working fine):
private static final String HTML_IMG_TAG_PATTERN = "\\s*(?i)src\\s*=\\s*(\"([^\"]*\")|'[^']*'|([^'\">\\s]+))";
private static final String IMG_TAG_PATTERN = "(?i)<img([^>]+)>(.+?)>";
To find links of the form http://www.w3schools.com/html/html_links.asp (working fine):
private static final String HTML_A_HREF_TAG_PATTERN = "\\s*(?i)href\\s*=\\s*(\"([^\"]*\")|'[^']*'|([^'\">\\s]+))";
private static final String HTML_A_TAG_PATTERN = "(?i)<a([^>]+)>(.+?)</a>";
The links and images are extracted correctly, but the CSS file isn't. I would like to extract the link to the CSS file so that I can save it. Could anyone point out what I missed?
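One possible cause, offered as a guess rather than a confirmed diagnosis: `<link>` is a void element with no closing tag, and `CSS_TAG_PATTERN` requires at least one more character followed by another `>` after the tag. Since `.` does not match line terminators by default, a `<link ...>` that ends its line would never match. The sketch below matches only the tag itself and then pulls the `href` out of its attributes. The class name `CssLinkExtractor`, the method `extractCssHrefs`, both pattern constants, and the sample HTML are hypothetical names chosen for illustration, not part of the original program.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CssLinkExtractor {
    // Hypothetical replacement for CSS_TAG_PATTERN: match the <link ...> tag
    // alone and capture its attribute text; nothing is required after the '>'.
    private static final Pattern LINK_TAG =
            Pattern.compile("<link\\b([^>]*)>", Pattern.CASE_INSENSITIVE);

    // href value in double quotes (group 1), single quotes (group 2),
    // or unquoted (group 3); the quotes themselves are not captured.
    private static final Pattern HREF_ATTR = Pattern.compile(
            "\\bhref\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)'|([^'\">\\s]+))",
            Pattern.CASE_INSENSITIVE);

    public static List<String> extractCssHrefs(String html) {
        List<String> hrefs = new ArrayList<>();
        Matcher tag = LINK_TAG.matcher(html);
        while (tag.find()) {
            Matcher href = HREF_ATTR.matcher(tag.group(1));
            if (!href.find()) {
                continue; // this <link> has no href attribute
            }
            // Exactly one of the three alternatives captured the value.
            for (int g = 1; g <= 3; g++) {
                if (href.group(g) != null) {
                    hrefs.add(href.group(g));
                    break;
                }
            }
        }
        return hrefs;
    }

    public static void main(String[] args) {
        // Assumed sample input: a <link> tag that ends its line.
        String html = "<head>\n"
                + "<link rel=\"stylesheet\" href=\"/css/style.css\">\n"
                + "</head>";
        System.out.println(extractCssHrefs(html));
    }
}
```

This returns every `<link>` href, including icons and feeds, so a check on the `rel` attribute may be needed if only stylesheets are wanted. The extracted values are often relative, in which case they can be resolved against the page address with `java.net.URI.resolve` before downloading.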