1

I'm dealing with text formatting from a plaintext message (HL7) and reformatting it for display. An example of one is \.sp5\. This means put in five line breaks.

So I'm thinking I would want to do something like this:

Pattern.compile("\\\.sp(\d+)\\").matcher(retval).replaceAll("\n{$1}");

My IDE is telling me that there is an invalid escape sequence at \d and I am not sure if the replaceAll argument will do what I expect. I think that regular expression is describing "backslash dot s p one-or-more-digits backslash" and I want the replacement to say "put in $1 line breaks".

How can I accomplish this?

The solution was a combination from two commenters below:

Pattern verticalSpacesPattern = Pattern.compile("\\\\\\.sp(\\d+)\\\\", Pattern.MULTILINE);
Matcher verticalSpacesMatcher = verticalSpacesPattern.matcher(retval);

while (verticalSpacesMatcher.find()) {
    int lineBreakCount = Integer.parseInt(verticalSpacesMatcher.group(1));
    String lineBreaks = StringUtils.repeat("\n", lineBreakCount);
    String group = verticalSpacesMatcher.group(0);
    retval = StringUtils.replace(retval, group, lineBreaks);
}
Freiheit

4 Answers

1

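Regular expressions written as Java string literals need every backslash doubled. That's because "\" is the escape character in string literals, so a single literal backslash has to be written as "\\". The regex engine also uses the backslash for escaping, which means matching one literal backslash takes four backslashes in the source. So you probably want: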

Pattern.compile("\\\\\\.sp(\\d+)\\\\").matcher(retval).replaceAll("\\n{$1}");
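To see what each layer of escaping does, here is a minimal sketch that applies the line above unchanged to a sample string. The class name, method name, and sample input are made up for illustration. It suggests that the pattern matches as intended, but that the replacement text \n{$1} comes out literally: inside a replacement string, a backslash only makes the next character literal, and braces carry no repeat meaning.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names and the sample input are illustrative only.
class EscapeLayersDemo {
    // The Java literal "\\\\\\.sp(\\d+)\\\\" is the regex  \\\.sp(\d+)\\
    // which matches: backslash, dot, "sp", one or more digits, backslash.
    static final Pattern SP = Pattern.compile("\\\\\\.sp(\\d+)\\\\");

    // Applies the replacement string from the answer above as-is.
    static String literalReplace(String input) {
        // The replacement text is \n{$1}. In a replacement string, backslash-n
        // is just the letter 'n', '{' and '}' are ordinary characters, and $1
        // is the captured digits.
        return SP.matcher(input).replaceAll("\\n{$1}");
    }
}
```

Calling EscapeLayersDemo.literalReplace("a\\.sp5\\b") returns the text an{5}b rather than five line breaks, so the regex can be reused but the repetition has to be done in code.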
Dave DiFranco
1

Use this:

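import java.io.File;
import java.io.FileInputStream;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Replacement {
    public static void main(String[] args) throws Exception {
        // Create a pattern to match the vertical-space command.
        // Note: "\\\\." is the regex \\. (a backslash, then any character),
        // and the trailing backslash of \.sp5\ is left unmatched.
        Pattern p = Pattern.compile("\\\\.sp(\\d+)", Pattern.MULTILINE);

        // Get a channel for the source file
        File f = new File("Replacement.java");
        try (FileInputStream fis = new FileInputStream(f);
             FileChannel fc = fis.getChannel()) {
            // Get a CharBuffer from the source file
            ByteBuffer bb = fc.map(FileChannel.MapMode.READ_ONLY, 0, fc.size());
            Charset cs = Charset.forName("8859_1");
            CharsetDecoder cd = cs.newDecoder();
            CharBuffer cb = cd.decode(bb);

            // Run some matches
            Matcher m = p.matcher(cb);
            while (m.find()) {
                // group(0) is the whole match, group(1) is the captured number
                int n = Integer.parseInt(m.group(1));
                // Emit one line break per requested blank line
                for (int i = 0; i < n; i++) {
                    System.out.print("\n");
                }
            }
        }
    }
}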
Ovidiu Buligan
  • Accepted because of the general pattern for dealing with the problem. I had to use the regex from Dave DiFranco. Edited original post with sol'n – Freiheit Mar 17 '11 at 17:07
1

You have to escape the backslashes so the compiler ignores them but the regex engine sees them.

Backslashes within string literals in Java source code are interpreted as required by the Java Language Specification as either Unicode escapes or other character escapes. It is therefore necessary to double backslashes in string literals that represent regular expressions to protect them from interpretation by the Java bytecode compiler.

http://download.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html

The replaceAll() part will not do what you want (repeating the replacement a number of times) because there are no provisions for that in replacement text patterns. You'll have to capture the integer with .group(1), convert it to an integer with Integer.valueOf(), and then repeat the replacement text that number of times.
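The capture, convert, and repeat steps described here can be sketched with Matcher.appendReplacement and Matcher.appendTail, which splice a computed replacement into the output for each match. The class and method names are hypothetical, and the pattern is the one from the question's final solution without the MULTILINE flag. Treat it as one possible shape rather than the only way to do it.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the class and method names are illustrative only.
class VerticalSpaceExpander {
    // Regex \\\.sp(\d+)\\ : backslash, dot, "sp", digits, backslash.
    private static final Pattern SP = Pattern.compile("\\\\\\.sp(\\d+)\\\\");

    // Replaces every \.spN\ command with N line breaks.
    static String expand(String input) {
        Matcher m = SP.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // group(1) holds the digits; parseInt throws NumberFormatException
            // if they do not fit in an int.
            int count = Integer.parseInt(m.group(1));
            StringBuilder breaks = new StringBuilder();
            for (int i = 0; i < count; i++) {
                breaks.append('\n');
            }
            // quoteReplacement guards against '$' and '\' being interpreted,
            // which matters if the replacement text ever changes.
            m.appendReplacement(out, Matcher.quoteReplacement(breaks.toString()));
        }
        // Copy whatever follows the last match.
        m.appendTail(out);
        return out.toString();
    }
}
```

With this sketch, VerticalSpaceExpander.expand(retval) would return the message text with each command expanded in a single pass, without needing Commons Lang.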

Apalala
0

You can't use regex that way.

Instead, you should take the number captured by (\d+) and use it in a loop that builds the replacement for each \.sp(\d+)\ match. I have never seen a regex replacement that uses this kind of dynamic construct, and BTW the regex engine would have to treat the captured group as a number rather than a string for it to work.

So I suggest retrieving the number and using it to build the replacement string \n\n...\n with the right number of line breaks. Then you can do the replacement.
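On newer JDKs the same idea fits in one expression. This is a sketch that assumes Java 9 or later for Matcher.replaceAll(Function) and Java 11 or later for String.repeat, neither of which existed when this question was asked. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical helper; assumes Java 11+ (String.repeat) and
// Java 9+ (Matcher.replaceAll taking a function).
class FunctionalSpaceExpander {
    // Regex \\\.sp(\d+)\\ : backslash, dot, "sp", digits, backslash.
    private static final Pattern SP = Pattern.compile("\\\\\\.sp(\\d+)\\\\");

    // Builds the replacement per match from the captured number.
    static String expand(String input) {
        return SP.matcher(input)
                 .replaceAll(match -> "\n".repeat(Integer.parseInt(match.group(1))));
    }
}
```

The string returned by the function is still treated as a replacement string, so a backslash or dollar sign in it would need Matcher.quoteReplacement. A run of newline characters contains neither, so it is passed through as-is here.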

M'vy