2

I have generated a constant by regex called PUNCTUATION that contains everything that is defined to be punctuation, i.e.

PUNCTUATION = " !\"',;:.-_?)([]<>*#\n\t\r"

The only problem is that I am not sure how to use this to remove all leading and trailing punctuation from a specified word. I have tried methods like replaceAll and startsWith but have had no luck.

Any suggestions anyone?

ajduke
Rajiv V

3 Answers

1

Completely untested, but should work:

public static String trimChars(String source, String trimChars) {
    char[] chars = source.toCharArray();
    int length = chars.length;
    int start = 0;

    while (start < length && trimChars.indexOf(chars[start]) > -1) {
        start++;
    }

    while (start < length && trimChars.indexOf(chars[length - 1]) > -1) {
        length--;
    }

    if (start > 0 || length < chars.length) {
        return source.substring(start, length);
    } else {
        return source;
    }
}

And you'd call it this way:

String trimmed = trimChars(input, PUNCTUATION);
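
Since the method above is described as untested, a quick check on a few edge cases may be useful. The following is only a sketch: the class name TrimCharsDemo and the sample inputs are made up for illustration, and the method is a condensed copy of the same two-index scan so that the example compiles on its own.

```java
final class TrimCharsDemo {
    static final String PUNCTUATION = " !\"',;:.-_?)([]<>*#\n\t\r";

    // Same two-index scan as trimChars above, repeated so this demo is self-contained.
    static String trimChars(String source, String trimChars) {
        int start = 0;
        int end = source.length();
        // Advance past leading characters that appear in trimChars.
        while (start < end && trimChars.indexOf(source.charAt(start)) >= 0) {
            start++;
        }
        // Back up over trailing characters that appear in trimChars.
        while (start < end && trimChars.indexOf(source.charAt(end - 1)) >= 0) {
            end--;
        }
        return source.substring(start, end);
    }

    public static void main(String[] args) {
        // Hypothetical sample inputs.
        System.out.println(trimChars("(hello!)", PUNCTUATION));        // hello
        System.out.println(trimChars("don't,", PUNCTUATION));          // don't
        System.out.println(trimChars("?!...", PUNCTUATION).isEmpty()); // true
    }
}
```

Punctuation inside the word (the apostrophe in "don't") is left alone, and a string made up entirely of punctuation comes back empty because the first loop stops once start reaches the end.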
Sean Bright
  • Yes, that works well. I was trying to think along the same lines as you, but I never put two and two together when structuring the while loop; I was setting up my conditions differently. Thank you very much for your help :) – Rajiv V May 23 '13 at 16:36
  • And also thanks to everyone else who provided an answer. I saw some new techniques that way as well :) – Rajiv V May 23 '13 at 16:54
0
    String PUNCTUATION = " !\"',;:.-_?)([]<>*#\n\t\r";
    String pattern = "([" + PUNCTUATION.replaceAll("(.)", "\\\\$1") + "]+)";
    //[\ \!\"\'\,\;\:\.\-\_\?\)\(\[\]\<\>\*\#\t\n]
    pattern = "\\b" + pattern + "|" + pattern + "\\b";
    String text = ".\n<>#aword,... \n\t..# asecondword,?";
    System.out.println( text.replaceAll(pattern, "") );
    //awordasecondword

\b

is for a word boundary.

First, put your characters inside [ ] (a character class) and escape the special characters.

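"\b" + pattern

matches punctuation that starts right after a word character, i.e. trailing characters, and

pattern + "\b"

matches punctuation that ends right before a word character, i.e. leading characters.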
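
Note that the \b version also removes punctuation between words (the sample output runs the two words together). If the input is a single word, one alternative is to anchor the character class to the start and end of the input instead. This is only a sketch under that assumption: AnchoredTrim, toCharClass and EDGES are hypothetical names, and only the characters that are special inside a character class are escaped.

```java
import java.util.regex.Pattern;

final class AnchoredTrim {
    static final String PUNCTUATION = " !\"',;:.-_?)([]<>*#\n\t\r";

    // Wraps the characters in [...], escaping only those that are special inside a class.
    static String toCharClass(String chars) {
        StringBuilder sb = new StringBuilder("[");
        for (char c : chars.toCharArray()) {
            if ("\\[]^-&".indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.append(']').toString();
    }

    // \A anchors the first alternative to the start of input, \z anchors the second to the end.
    static final Pattern EDGES = Pattern.compile(
            "\\A" + toCharClass(PUNCTUATION) + "+|" + toCharClass(PUNCTUATION) + "+\\z");

    static String trim(String word) {
        return EDGES.matcher(word).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(trim(" !abc;-< "));        // abc
        System.out.println(trim(".\n<>#aword,...")); // aword
        System.out.println(trim("a,b"));              // a,b
    }
}
```

Because both alternatives are anchored, punctuation in the middle of the input is never touched, and compiling the pattern once avoids rebuilding it on every call.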

yavuzkavus
0

A method that strips the given characters from the start and end of a string (this should be more time-efficient than applying regex patterns):

public class StringUtil {
    // Public so that callers outside this class can pass it to strip().
    public static final String PUNCTUATION = " !\"',;:.-_?)([]<>*#\n\t\r";

    public static String strip(String original, String charsToRemove) {
        if (original == null) {
            return null;
        }

        int end = original.length();
        int start = 0;
        char[] val = original.toCharArray();
        while (start < end && charsToRemove.indexOf(val[start]) >= 0) {
            start++;
        }
        while (start < end && charsToRemove.indexOf(val[end - 1]) >= 0) {
            end--;
        }
        return ((start > 0) || (end < original.length())) ? original.substring(start, end) : original;
    }
}

Use like this:

assertEquals("abc", StringUtil.strip(" !abc;-< ", StringUtil.PUNCTUATION));
EmirCalabuch