
If I have a string: String string = "Hi my name is \"Bob Peters\""

I want to split the string on whitespace, but only where the whitespace is not inside quotes. I also don't want the quotes to appear in the final result.

So the end result would be

Hi, my, name, is, Bob Peters

Where the name is together and the rest are split up.

In Groovy, here is what I have so far:

def text = "Hi my name is 'Bob Peters'"
def newText = text.split(/\s(?=(?:[^'"`]*(['"`])[^'"`]*\1)*[^'"`]*$)/);
println(newText)

This results in

Hi
my
name
is
'Bob Peters'

But I also need to remove the single/double quotes surrounding Bob Peters.

Alex Len
  • How is this different to [Groovy string split regular expression is not working properly](https://stackoverflow.com/questions/66308474/groovy-string-split-regular-expression-is-not-working-properly) – cfrick Feb 22 '21 at 06:44

1 Answer

The simplest option is to add a second step that removes the quotes, as below:

def text = "'Hi there' my name is 'Bob Peters' 'Additional quotes'"
// Split on whitespace that is followed by a balanced number of quotes
def newText = text.split(/\s(?=(?:[^'"`]*(['"`])[^'"`]*\1)*[^'"`]*$)/);
// Strip a leading and trailing quote from each token
print(newText.collect {
    it.replaceAll(/^['"`](.*)['"`]$/,'$1');
})

It would print [Hi there, my, name, is, Bob Peters, Additional quotes]
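The two-step approach can be wrapped in a small closure and applied to the input from the question. This is only a sketch: the name `splitOutsideQuotes` is hypothetical, and the stripping regex is slightly tightened (an assumption, not part of the code above) so that a quote is removed only when the same quote character closes the token.

```groovy
// Hypothetical helper: split on whitespace outside quotes, then strip
// one pair of matching enclosing quotes from each token.
def splitOutsideQuotes = { String s ->
    def parts = s.split(/\s(?=(?:[^'"`]*(['"`])[^'"`]*\1)*[^'"`]*$)/)
    // \1 requires the closing quote to be the same character as the opening one
    parts.collect { it.replaceAll(/^(['"`])(.*)\1$/, '$2') }
}

def tokens = splitOutsideQuotes('Hi my name is "Bob Peters"')
assert tokens == ['Hi', 'my', 'name', 'is', 'Bob Peters']
println tokens   // [Hi, my, name, is, Bob Peters]
```

With the tightened regex, a token with mismatched quotes such as `'abc"` is left unchanged rather than being stripped.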


Alternatively, we can treat a space that is optionally preceded and followed by a quote character (['"`]) as the split pattern.

On its own this will not remove quotes at the very start and end of the string, so the split pattern needs extra alternatives that match a quote at the start or the end of the string.

So the pattern becomes

^['"`]|['"`]$|['"`]?<<< Your Existing Pattern >>>['"`]?

There is one more issue with this approach. If the string starts with a quote, like 'Hi there' ..., then an empty string will be prepended to the output.

To make the output consistent, we add a space at the beginning of the string and always ignore the first element of the result array. The final pattern will be

^\s['"`]|['"`]$|['"`]?<<<Your Existing Pattern>>>['"`]?

Groovy Code:

def text = "Hi there my name is 'Bob Peters' 'Additional quotes'"
def newText = (" " + text).split(/^\s['"`]|['"`]$|['"`]?\s(?=(?:[^'"`]*(['"`])[^'"`]*\1)*[^'"`]*$)['"`]?/);
print(newText[1..newText.size()-1])

Will print [Hi, there, my, name, is, Bob Peters, Additional quotes]
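The example above does not start with a quote, so here is a minimal sketch that exercises the leading-quote case the prepended space is meant to handle. The closure name `splitAndStrip` is hypothetical, and `drop(1)` is used in place of the range index as an assumption that it is the safer way to skip the first element.

```groovy
def pattern = /^\s['"`]|['"`]$|['"`]?\s(?=(?:[^'"`]*(['"`])[^'"`]*\1)*[^'"`]*$)['"`]?/

// Hypothetical helper: prepend a space so the first element is always empty,
// split, then drop that first element.
def splitAndStrip = { String s ->
    def parts = (' ' + s).split(pattern)
    parts.drop(1).toList()
}

// Input that starts with a quote
assert splitAndStrip("'Hi there' my name is 'Bob Peters'") ==
        ['Hi there', 'my', 'name', 'is', 'Bob Peters']

// Input that does not start with a quote
assert splitAndStrip("Hi my name is 'Bob Peters'") ==
        ['Hi', 'my', 'name', 'is', 'Bob Peters']
```

Both inputs go through the same code path because the first array element is empty in either case.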


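Note: The pattern with the positive lookahead, \s(?=(?:[^'"`]*(['"`])[^'"`]*\1)*[^'"`]*$), does not handle nested quotes. For example,

Hi there `Outer quotes 'inner quotes'`

will not be split at all, because the lookahead fails at every space, including the ones outside the quotes.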
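The nested-quote limitation can be checked with a short sketch. The variable names here are illustrative only.

```groovy
def pattern = /\s(?=(?:[^'"`]*(['"`])[^'"`]*\1)*[^'"`]*$)/
def nested = "Hi there `Outer quotes 'inner quotes'`"

def parts = nested.split(pattern)

// The lookahead requires each quote to be closed by the same character with
// no other quote in between, so it fails at every space in this input.
assert parts.size() == 1
assert parts[0] == nested
```

Handling nesting would likely need a small parser that tracks the currently open quote character rather than a single regular expression.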

Prasanna