
I'm trying to get a 'teaser' of a given String and put it as a value into a HashMap. By 'teaser' I mean a substring (max length 50 characters) ending at a word boundary.

Here's a code sample showing how I'm trying to do it:

import java.util.regex.*;             

public class Test {                    
  public static void main(String[] args) throws Exception {
    final Pattern pattern = Pattern.compile("(^.{0,50}\b)"); 
    final Matcher m = pattern.matcher(
        "This is a long string that I want to find a shorter teaser for."); 
    if (m.find()) {
      System.out.println("Found: " + m.group(1)); 
    } else {  
      System.out.println("No match");   
    }                                                          
  }             
}    

I expected it to print:

Found: This is a long string that I want to find a

But instead it prints:

No match

If I test this regex separately, it does what it should: it finds a substring of at most 50 characters that ends on a word boundary. But when I debug the program, m.find() always returns false.

Any ideas how to solve this? (I'm focused on getting the teaser, not on using Matcher.find() ;-) )

VLAZ
anna

1 Answer


According to the Oracle documentation on Characters, \b inside a Java String literal is the escape sequence for backspace. You want \b as the regex for a word boundary, so you need to escape the backslash itself, i.e. write \\, so that Pattern.compile receives the two characters \b:

Pattern.compile("(^.{0,50}\\b)")
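For reference, here is a minimal sketch of the question's program with the escaped backslash applied. The class name TeaserDemo and the helper teaser() are made up for illustration and are not part of the original code. Note that the match itself ends just before "shorter", so it includes a trailing space; the sketch assumes you want that trimmed.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TeaserDemo {
    // "\\b" in Java source is the two characters \ and b,
    // which the regex engine interprets as a word boundary.
    private static final Pattern TEASER = Pattern.compile("^.{0,50}\\b");

    // Hypothetical helper: returns at most 50 leading characters of text,
    // cut at a word boundary, or "" if the pattern does not match.
    static String teaser(String text) {
        Matcher m = TEASER.matcher(text);
        // trim() drops the trailing space left when the match ends before a word.
        return m.find() ? m.group().trim() : "";
    }

    public static void main(String[] args) {
        System.out.println("Found: " + teaser(
            "This is a long string that I want to find a shorter teaser for."));
    }
}
```

The result of teaser() could then be stored as the HashMap value directly.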

You can see this effect by calling .toCharArray() on a String

Single backslash

System.out.println(Arrays.toString("\b".toCharArray()));
=> [] (the output looks empty, but the array actually holds one unprintable backspace character, U+0008)

Double backslash

System.out.println(Arrays.toString("\\b".toCharArray()));
=> [\, b]
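Because the backspace character is invisible when printed, it can help to print the code points instead. The following is a small sketch of that idea; the class name EscapeDemo and the helper describe() are hypothetical names chosen for this example.

```java
public class EscapeDemo {
    // Hypothetical helper: renders each char of s as a U+XXXX code unit,
    // separated by spaces, so unprintable characters become visible.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // One character: backspace.
        System.out.println(describe("\b"));   // U+0008
        // Two characters: backslash followed by the letter b.
        System.out.println(describe("\\b"));  // U+005C U+0062
    }
}
```

Only the second string gives the regex engine the \b it needs for a word boundary.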
Adam
  • D'oh! Thanks so much, the second slash was what was missing. Works perfectly fine now! – anna Feb 12 '15 at 08:58