5

I'm looking for a regex solution for the following scenario:

I have strings that I need to split at uppercase letters, but runs of consecutive uppercase letters should not be split.

For example, if the input is

DisclosureOfComparativeInformation

the output should be

Disclosure Of Comparative Information

But consecutive uppercase letters should not be split:

GAAP should not result in G A A P.

How can I find this pattern and insert a space?

Thanks

Phil
Deepa

7 Answers

9

Try:

var subjectString = "DisclosureOfComparativeInformation";
var resultString = Regex.Replace(subjectString, "([a-z])([A-Z])", "$1 $2");
ipr101
1

Try this regex:

[a-z](?=[A-Z])

With this call to replace:

regex.Replace(toMatch, "$& ")

For more information on the special replacement symbol "$&", see http://msdn.microsoft.com/en-us/library/ewy2t5e0.aspx#EntireMatch
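
As a rough sketch of how the pattern and the "$&" replacement fit together, the snippet below wraps them in a small helper. The class and method names (LookaheadDemo, InsertSpaces) are made up for illustration and do not come from the answer.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper, for illustration only.
public static class LookaheadDemo
{
    public static string InsertSpaces(string input)
    {
        // Match a lowercase letter that is immediately followed by an uppercase one.
        // The lookahead consumes nothing, so "$&" (the whole match) is just the
        // lowercase letter, and the replacement appends a space after it.
        return Regex.Replace(input, "[a-z](?=[A-Z])", "$& ");
    }

    public static void Main()
    {
        Console.WriteLine(InsertSpaces("DisclosureOfComparativeInformation"));
        Console.WriteLine(InsertSpaces("GAAP"));
    }
}
```

Note that this only splits where a lowercase letter meets an uppercase one, so an input such as "GAAPReport" comes back unchanged.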

WiseGuyEh
1

[A-Z]{1}[a-z]+

will split as follows if each match is replaced with the match plus a space (trim the trailing space afterwards)

DisclosureOfComparativeInformation -> Disclosure Of Comparative Information

GAPS -> GAPS

SOmething -> SOmething (this one may be undesirable)

alllower -> alllower

smitec
1
((?<=[a-z])[A-Z]|[A-Z](?=[a-z]))

replace with

" $1"

In a second step you'd have to trim the string.

Also check out this link:

Regular expression, split string by capital letter but ignore TLA
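
If the trim step is unwanted, one possible variant is to match only the zero-width boundaries and replace them with a space. This is a sketch under an assumption the question does not state, namely that an acronym followed by a word ("GAAPReport") should split into "GAAP Report". The names CamelCaseSplitter and InsertSpaces are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class CamelCaseSplitter
{
    // Two zero-width boundaries:
    //   1. between a lowercase letter and an uppercase letter;
    //   2. between an uppercase letter and an uppercase letter that starts
    //      a word (that is, one followed by a lowercase letter).
    private static readonly Regex Boundary =
        new Regex(@"(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])");

    public static string InsertSpaces(string input)
    {
        // Nothing is consumed, so no leading space appears and no Trim is needed.
        return Boundary.Replace(input, " ");
    }

    public static void Main()
    {
        Console.WriteLine(InsertSpaces("DisclosureOfComparativeInformation"));
        Console.WriteLine(InsertSpaces("GAAP"));
        Console.WriteLine(InsertSpaces("GAAPReport"));
    }
}
```

Like the pattern above, this only looks at ASCII letters; digits and non-ASCII capitals would need a wider character class.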

Community
sikender
0

Regex solutions that look for places where something is not true tend to become unreadable. I'd recommend going through the string in a loop and splitting it accordingly, without using regex.
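
A minimal sketch of what such a loop might look like is below. The splitting rule is an assumption on my part: a space goes before an uppercase letter that either follows a lowercase letter or ends a run of capitals and is followed by a lowercase letter. LoopSplitter and InsertSpaces are made-up names.

```csharp
using System;
using System.Text;

// Hypothetical helper, for illustration only.
public static class LoopSplitter
{
    public static string InsertSpaces(string input)
    {
        var sb = new StringBuilder(input.Length + 8);
        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            if (i > 0 && char.IsUpper(c))
            {
                bool prevIsLower = char.IsLower(input[i - 1]);
                bool prevIsUpper = char.IsUpper(input[i - 1]);
                bool nextIsLower = i + 1 < input.Length && char.IsLower(input[i + 1]);

                // Start of a new word after a lowercase letter, or the last
                // capital of an acronym run that begins the next word.
                if (prevIsLower || (prevIsUpper && nextIsLower))
                {
                    sb.Append(' ');
                }
            }
            sb.Append(c);
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(InsertSpaces("DisclosureOfComparativeInformation"));
        Console.WriteLine(InsertSpaces("GAAP"));
    }
}
```

The loop is longer than a one-line regex, but each condition can be read and changed on its own.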

Ingo
0

In Perl this should work:

$str =~ s/([A-Z][a-z])/ $1/g;

The parentheses around the two character classes capture the match so that "$1" (capture group one) can refer to it in the replacement.

jcadcell
  • 1
    won't this only match the first two chars? – smitec Sep 22 '11 at 12:49
  • @smitec Thanks for catching that. I have added the 'g' modifier to the end which replaces all occurrences of the pattern (in Perl). Also, this solution would need to be trimmed, as in sikender's answer. – jcadcell Sep 22 '11 at 13:02
0

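Split and Join (Regex.Split also returns the empty strings between adjacent matches, so filter them out; this needs System.Linq):

string.Join(" ", Regex.Split("DisclosureOfComparativeInformation", @"([A-Z][a-z]*)").Where(s => s.Length > 0))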
onof