
I'm trying to find the words adjacent to each proper noun in a sentence, but the code below throws an exception. I've tried various combinations of backslashes and closing brackets and still get the error. Any help would be appreciated.

    for (String properNoun : properNouns) {

        Pattern pattern = Pattern.compile("([^\\s]+\\s+[^\\s]+)\\s+" + properNoun + "\\s+([^\\s]+\\s+[^\\s]+)\\s+");
        Matcher matcher = pattern.matcher(sentence);

        while (matcher.find()) {
            ................

The error I get is:

Exception in thread "main" java.util.regex.PatternSyntaxException:
Unclosed character class near index 44
([^\s]+\s+[^\s]+)\s+[]\s+([^\s]+\s+[^\s]+)\s+

Sentence - Cyprium Mining is pleased to announce that at a special meeting of debenture holders held on September 21 , 2016 ( the " Meeting " ) the holders of dollars 750,000 in principal_amount of unsecured debentures bearing interest at 12 % per_annum ( the " Debentures " ) overwhelmingly approved all matters presented , including the extension of the maturity_date from February 28 , 2017 to February 28 , 2019 .

properNouns - [[], [Cyprium], [February], [the Debentures], [September]]

serendipity

1 Answer


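The first `properNoun` value is `[]`. In a regex, `[` opens a character class, so once that value is concatenated into the pattern the expression is no longer valid and compilation fails with "Unclosed character class". If you want to match a literal `[]`, you have to escape it, like `\\[\\]`.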

Given your data, you could quote each value instead of escaping it by hand:

Pattern pattern = Pattern.compile("([^\\s]+\\s+[^\\s]+)\\s+"+Pattern.quote(properNoun)+"\\s+([^\\s]+\\s+[^\\s]+)\\s+");

Pattern.quote:

Returns a literal pattern String for the specified String
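
The `Pattern.quote` approach can be tried out with a small self-contained sketch. The class name `ProperNounContext`, the helper `contextsAround`, and the shortened sample sentence are made up for illustration; only the regex and the `Pattern.quote` call come from the snippet above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ProperNounContext {

    // Hypothetical helper: collects "two words before | two words after"
    // for every occurrence of properNoun in the sentence.
    static List<String> contextsAround(String sentence, String properNoun) {
        List<String> contexts = new ArrayList<>();
        // Pattern.quote makes the value match literally, so characters
        // such as '[' and ']' no longer act as regex syntax.
        Pattern pattern = Pattern.compile(
                "([^\\s]+\\s+[^\\s]+)\\s+" + Pattern.quote(properNoun)
                + "\\s+([^\\s]+\\s+[^\\s]+)\\s+");
        Matcher matcher = pattern.matcher(sentence);
        while (matcher.find()) {
            contexts.add(matcher.group(1) + " | " + matcher.group(2));
        }
        return contexts;
    }

    public static void main(String[] args) {
        // Shortened excerpt of the sentence from the question.
        String sentence =
                "the maturity_date from February 28 , 2017 to February 28 , 2019 .";

        // "[]" now compiles; it just finds nothing in this sentence.
        System.out.println(contextsAround(sentence, "[]"));
        System.out.println(contextsAround(sentence, "February"));
    }
}
```

Note that quoting only makes the pattern compile. If the values really carry the surrounding brackets shown in the question (for example `[February]`), the quoted pattern will look for those literal brackets in the sentence, so they would probably need to be stripped first.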

chengpohi
  • I'm looping over proper nouns and they do contain characters. I've updated the question to add those values. The first value in my proper noun set is null. Could that be the reason for the error? – serendipity Dec 26 '16 at 08:48
  • @serendipity maybe you can use `Pattern.quote` to escape the str – chengpohi Dec 26 '16 at 08:52