I wrote the following regular expression in Java:

(^(?!\\s+$).*[^\\/:*?\"<>|]+(\\.(?i)(txt|rtf|doc|docx|htm|html|pdf))$)

But I need to use it in an XML schema, so I changed it to:

(^(?!\s+$).*[^\/:*?\&quot;&lt;&gt;|]+(\.(?i)(txt|rtf|doc|docx|htm|html|pdf))$)

But this version accepts files with extensions that are not listed. What's wrong?

Márcio
  • Why don't you use an XML parsing API? – Braj Aug 29 '14 at 19:11
  • `;&lt` in char class doesn't mean the string `;&lt`. It means match ; or & or l or t – Avinash Raj Aug 29 '14 at 19:12
  • The obligatory comment to all posts asking about using regex to parse XML: http://stackoverflow.com/a/1732454/18157 – Jim Garrison Aug 29 '14 at 19:19
  • @JimGarrison He doesn't want to parse XML with regex, he wants to store the regex pattern in an XML file... Not the same thing at all ;) EDIT: Well that was my first interpretation, now I'm not so sure anymore... the question isn't particularly clear. – Lucas Trzesniewski Aug 30 '14 at 00:05

2 Answers

Your regular expression shouldn't match anything: for a start, "^" and "$" are not meta-characters in the XSD regex dialect; they match literal "^" and "$" characters, because an XSD pattern is always matched against the whole value. There are also other constructs here that XSD regexes don't allow, such as the lookahead "(?!...)" and the inline flag "(?i)". However, you may be using a schema processor, such as the Microsoft one, that does its own thing rather than following the W3C specification.

It would be easier for us if you described your requirement, rather than asking us to work it out by reverse engineering a complex regular expression. Don't forget that you can specify more than one pattern. If you just want to require one of the specified extensions, use:

.*\.(txt|rtf|doc|docx|htm|html|pdf)
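To see how a W3C-conformant processor treats that pattern, here is a minimal sketch using the JDK's built-in javax.xml.validation API. The schema, the "file" element name, and the XsdPatternDemo class are made up for illustration; they are not taken from the question.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class XsdPatternDemo {
    // Hypothetical schema: one <file> element restricted by the pattern above.
    // No ^ or $ is needed, since an XSD pattern must match the whole value.
    static final String XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
      + "<xs:element name=\"file\">"
      + "<xs:simpleType><xs:restriction base=\"xs:string\">"
      + "<xs:pattern value=\".*\\.(txt|rtf|doc|docx|htm|html|pdf)\"/>"
      + "</xs:restriction></xs:simpleType>"
      + "</xs:element></xs:schema>";

    static Schema loadSchema() {
        try {
            return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            // A broken schema is a programming error, not a validation result.
            throw new IllegalStateException("schema did not compile", e);
        }
    }

    // Returns true if <file>fileName</file> validates against the schema.
    static boolean isValid(String fileName) {
        // Minimal escaping so the name can be embedded as element text.
        String text = fileName.replace("&", "&amp;").replace("<", "&lt;");
        String xml = "<file>" + text + "</file>";
        try {
            loadSchema().newValidator()
                .validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the default error handler throws on invalid content
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String name : new String[] {"report.pdf", "report.exe", "report.PDF"}) {
            System.out.println(name + " -> " + isValid(name));
        }
    }
}
```

Note that the match is case-sensitive here, because XSD has no "(?i)" flag. If upper-case extensions must be accepted, they have to be spelled out in the pattern, for example with character classes such as [pP][dD][fF].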

Michael Kay

To match only files with the .xml extension, I pass the regular expression as a string argument, as below:

String xmlRegex = ".*\\.(xml)";

and verify it as below:

boolean matched = MatcherUtil.match( xmlRegex, filename );

It returns true if the filename matches and false if it does not.
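MatcherUtil is not part of the JDK, so here is a rough standard-library equivalent. It assumes MatcherUtil.match tests the whole string against the pattern; the ExtensionCheck class and isXmlFile method are names invented for this sketch.

```java
import java.util.regex.Pattern;

class ExtensionCheck {
    // Same pattern as above; compiled once and reused.
    private static final Pattern XML_FILE = Pattern.compile(".*\\.(xml)");

    // matches() requires the entire filename to match, so no ^ or $ is needed.
    static boolean isXmlFile(String filename) {
        return XML_FILE.matcher(filename).matches();
    }

    public static void main(String[] args) {
        System.out.println(isXmlFile("config.xml"));
        System.out.println(isXmlFile("config.xml.bak"));
    }
}
```

Because matches() is a whole-string match, this behaves like an XSD pattern in that respect: a name such as config.xml.bak is rejected even though it contains ".xml".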

Amulya Koppula