1

I implemented a replaceAll() method for StringBuilder using a Matcher; it is supposed to replace all punctuation with "". But it always throws an exception: "java.lang.StringIndexOutOfBoundsException: String index out of range: 6"

private static StringBuilder filterPunctuation(StringBuilder sb){
    Pattern pattern =  Pattern.compile("(\\.)");
    Matcher matcher = pattern.matcher(sb);
    while(matcher.find()){
        sb.replace(matcher.start(), matcher.end(), "");  
        // With sb.replace(matcher.start(), matcher.end(), " ") it works, but I want to replace all punctuation with ""
    }
    return sb;
}

public static void main(String[] args){
    System.out.println(filterPunctuation(new StringBuilder("test.,.")));
}
Anuj Balan
remy
  • Why don't you show your implementation here? – noMAD Apr 23 '12 at 04:19
  • Why would you do that? Why not just use the API that's there? Or is this "homework", in which case add that tag please – Bohemian Apr 23 '12 at 04:20
  • @Bohemian: There is no replaceAll for StringBuilder (of course, one does not lose much by going via String, I guess). – Thilo Apr 23 '12 at 04:23
  • Because I want to filter out some punctuation, and I guess string.replaceAll() would cost too much memory – remy Apr 23 '12 at 04:44
  • @remy: I would assume string.replaceAll to be implemented reasonably efficiently. – Thilo Apr 23 '12 at 05:26
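
The via-String route mentioned in the comments above could look roughly like this. It is only a sketch: the class name `ViaString` and the choice of `\p{Punct}` (ASCII punctuation) as the pattern are assumptions, not something from the question. The whole result is built as a String first, so the matcher never sees a half-modified buffer.

```java
import java.util.regex.Pattern;

class ViaString {
    // Assumed pattern: \p{Punct} matches ASCII punctuation such as . , ; ! ?
    private static final Pattern PUNCT = Pattern.compile("\\p{Punct}");

    static StringBuilder filterPunctuation(StringBuilder sb) {
        // replaceAll reads the whole input and returns a new String,
        // so the buffer is only modified after matching has finished.
        String cleaned = PUNCT.matcher(sb).replaceAll("");
        sb.setLength(0);
        sb.append(cleaned);
        return sb;
    }

    public static void main(String[] args) {
        System.out.println(filterPunctuation(new StringBuilder("test.,."))); // prints "test"
    }
}
```

This allocates one temporary String per call, which is usually acceptable unless the buffer is very large.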

2 Answers

6

If you change the StringBuilder inside the loop (especially its length, by removing characters), you need to get a new Matcher, because the old one will continue to look at the original buffer or an inconsistent mix of both.

Take a look at how Jon Skeet would do it.
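
A minimal sketch of that idea, not necessarily the linked implementation: `Matcher.find(int)` resets the matcher before searching, so each search sees the buffer's current contents and length. The class name `StringBuilderReplace` and the helper `replaceAll` are hypothetical (there is no such JDK method), and the replacement is treated as literal text with no `$1`-style group references.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StringBuilderReplace {
    // Hypothetical helper: replaces every match of pattern in sb, in place.
    static void replaceAll(StringBuilder sb, Pattern pattern, String replacement) {
        Matcher m = pattern.matcher(sb);
        int from = 0;
        // find(int) resets the matcher, so it picks up the modified buffer each time.
        while (from <= sb.length() && m.find(from)) {
            int start = m.start();
            int end = m.end();
            sb.replace(start, end, replacement);
            // Continue right after the inserted text so it is never re-matched.
            from = start + replacement.length();
            if (end == start) {
                from++; // step past an empty match to avoid looping forever
            }
        }
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("test.,.");
        replaceAll(sb, Pattern.compile("[.,]"), "");
        System.out.println(sb); // prints "test"
    }
}
```

With the question's pattern `"(\\.)"` the same call leaves `test,`, since that pattern matches only dots.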

Community
Thilo
2

I would assume this to do the trick

private static void filterPunctuation(StringBuilder sb)
{
    // Walk backwards so that deleting a character never shifts an index still to be visited.
    for (int i = sb.length() - 1; i >= 0; i--) {
        if (sb.charAt(i) == '.') sb.deleteCharAt(i);
    }
}

No need to return it as you are working on the same reference.
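
The loop above only checks for '.'. One way to cover more characters without a long chain of comparisons is to pass the unwanted characters as a String and compact the buffer in a single pass. This is a sketch under assumptions: the class name `PunctuationFilter` and the `punctuation` parameter are made up for illustration.

```java
class PunctuationFilter {
    // Removes every character of sb that occurs in the (assumed) punctuation set.
    static void filterPunctuation(StringBuilder sb, String punctuation) {
        int write = 0;
        for (int read = 0; read < sb.length(); read++) {
            char c = sb.charAt(read);
            if (punctuation.indexOf(c) < 0) {
                // Keep this character: move it down to the next free slot.
                sb.setCharAt(write++, c);
            }
        }
        // Cut off the leftover tail in one step.
        sb.setLength(write);
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("test.,.");
        filterPunctuation(sb, ".,;:!?()[]");
        System.out.println(sb); // prints "test"
    }
}
```

Each kept character is copied at most once, whereas repeated deleteCharAt calls shift the rest of the buffer on every deletion.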

BoltClock
  • +1 I like simple. You should probably modify your code to show how it would handle commas, brackets etc – Bohemian Apr 23 '12 at 05:03
  • That might become one ugly || chain ;) remy's original code doesn't take them into account so far, hence I omitted them –  Apr 23 '12 at 05:05