
I need to replace every underscore in a string with a space, except those that fall between a pair of apostrophes. For instance:

"first_name" => "first name"
"code_numbers = '123_456'" => "code numbers = '123_456'"

I currently replace every underscore using .replaceAll("_", " "), since underscores inside quotes are not very common, but I want to cover all bases now just in case.

user2303325

2 Answers


This should work: the regex matches every _ that is followed by an even number of single quotes up to the end of the string, which means the _ itself lies outside any quoted section. Of course, this requires your quotes to be balanced:

String str = "\"code_numbers = '123_456'\"";

str = str.replaceAll("(?x) " + 
               "_          " +   // Replace _
               "(?=        " +   // Followed by
               "  (?:      " +   // Start a non-capture group
               "    [^']*  " +   // 0 or more non-single quote characters
               "    '      " +   // 1 single quote
               "    [^']*  " +   // 0 or more non-single quote characters
               "    '      " +   // 1 single quote
               "  )*       " +   // 0 or more repetition of non-capture group (multiple of 2 quotes will be even)
               "  [^']*    " +   // Finally 0 or more non-single quotes
               "  $        " +   // Till the end  (This is necessary, else every _ will satisfy the condition)
               ")          " ,   // End look-ahead
                        " ");     // Replace with a single space
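
For reuse, the same look-ahead can be written on one line and wrapped in a helper. This is only a sketch: the class name UnderscoreLookahead and the method name replaceOutsideQuotes are made up for illustration, and it still assumes the single quotes in the input are balanced.

```java
import java.util.regex.Pattern;

class UnderscoreLookahead {
    // An underscore followed by an even number of single quotes before the
    // end of the input, i.e. an underscore outside any quoted section.
    private static final Pattern OUTSIDE_QUOTES =
            Pattern.compile("_(?=(?:[^']*'[^']*')*[^']*$)");

    // Hypothetical helper: replaces each unquoted underscore with a space.
    static String replaceOutsideQuotes(String input) {
        return OUTSIDE_QUOTES.matcher(input).replaceAll(" ");
    }

    public static void main(String[] args) {
        System.out.println(replaceOutsideQuotes("first_name"));
        System.out.println(replaceOutsideQuotes("code_numbers = '123_456'"));
    }
}
```

Compiling the pattern once avoids recompiling it on every call, which String.replaceAll would otherwise do.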
Rohit Jain

Resurrecting this question because it had a simple regex solution that wasn't mentioned. (Found your question while doing some research for a regex bounty quest.)

'[^']*'|(_)

The left side of the alternation matches complete 'single quoted strings'. We will ignore these matches. The right side matches and captures underscores to Group 1, and we know they are the right underscores because they were not matched by the expression on the left.

Here is working code (see online demo):

import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Program {
    public static void main(String[] args) {
        String subject = "code_numbers = '123_456'";
        Pattern regex = Pattern.compile("'[^']*'|(_)");
        Matcher m = regex.matcher(subject);
        StringBuffer b = new StringBuffer();
        while (m.find()) {
            if (m.group(1) != null) {
                // Group 1 is set: an underscore outside quotes
                m.appendReplacement(b, " ");
            } else {
                // A quoted string: copy it back unchanged, escaping $ and \
                m.appendReplacement(b, Matcher.quoteReplacement(m.group(0)));
            }
        }
        m.appendTail(b);
        String replaced = b.toString();
        System.out.println(replaced);
    } // end main
} // end Program
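
On Java 9 or later, the same loop can be expressed with Matcher.replaceAll, which accepts a function from each match to its replacement. The following is a sketch under that assumption; the class name QuoteAwareReplace and the method name replaceOutsideQuotes are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuoteAwareReplace {
    // Left side consumes whole quoted strings; right side captures
    // the remaining (unquoted) underscores in group 1.
    private static final Pattern QUOTED_OR_UNDERSCORE =
            Pattern.compile("'[^']*'|(_)");

    // Hypothetical helper; needs Java 9+ for replaceAll(Function).
    static String replaceOutsideQuotes(String input) {
        return QUOTED_OR_UNDERSCORE.matcher(input).replaceAll(mr ->
                mr.group(1) != null
                        ? " "                                    // unquoted underscore
                        : Matcher.quoteReplacement(mr.group())); // quoted text, kept as-is
    }

    public static void main(String[] args) {
        System.out.println(replaceOutsideQuotes("code_numbers = '123_456'"));
    }
}
```

The quoteReplacement call matters because the string returned by the function is still treated as a replacement string, so a $ or \ inside a quoted section would otherwise be interpreted as a group reference or escape.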

Reference

  1. How to match pattern except in situations s1, s2, s3
  2. How to match a pattern unless...
zx81