1

I want to split a camelCase name into its individual words using a regex, so that each word can be spell-checked.

The split should be as follows:

1) extendedStructureForNUB --> extended, Structure, For, NUB

2) extendedStructureFor2004 --> extended, Structure, For, 2004

Using the answer to the question below, I am able to get the split for the first case.

Question: RegEx to split camelCase or TitleCase (advanced)

But for a string containing a number (the second case), the result is not in the expected format:

extendedStructureFor2004 --> extended, Structure, For2004

Please suggest how I can adapt this regex so that it also splits off numerals.

Unni Kris

4 Answers

4
public static void main(String[] args)
{
    for (String w : "camelValue".split("(?<!(^|[A-Z0-9]))(?=[A-Z0-9])|(?<!^)(?=[A-Z][a-z])")) {
        System.out.println(w);
    }
}

Edit: To also handle a case such as UPPER2000UPPER, the regex becomes:

public static void main(String[] args)
{
    for (String w : "camelValue".split("(?<!(^|[A-Z0-9]))(?=[A-Z0-9])|(?<!(^|[^A-Z]))(?=[0-9])|(?<!(^|[^0-9]))(?=[A-Za-z])|(?<!^)(?=[A-Z][a-z])")) {
        System.out.println(w);
    }
}
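
As a quick check, the edited regex can be run against the two names from the question and the UPPER2000UPPER case. The sketch below is only illustrative: the class name `CamelCaseSplitDemo` and the helper `splitName` are made up for this example, and the regex itself is copied unchanged from the snippet above.

```java
import java.util.Arrays;
import java.util.List;

class CamelCaseSplitDemo {
    // The edited regex from this answer, broken into its four alternatives.
    static final String BOUNDARY =
            // not at the start, previous char is neither upper-case nor a digit,
            // next char is upper-case or a digit
            "(?<!(^|[A-Z0-9]))(?=[A-Z0-9])"
            // previous char is upper-case, next char is a digit
            + "|(?<!(^|[^A-Z]))(?=[0-9])"
            // previous char is a digit, next char is a letter
            + "|(?<!(^|[^0-9]))(?=[A-Za-z])"
            // not at the start, next is an upper-case letter followed by a lower-case one
            + "|(?<!^)(?=[A-Z][a-z])";

    // Hypothetical helper: splits one identifier at the boundaries above.
    static List<String> splitName(String name) {
        return Arrays.asList(name.split(BOUNDARY));
    }

    public static void main(String[] args) {
        for (String name : new String[] {
                "extendedStructureForNUB", "extendedStructureFor2004", "UPPER2000UPPER"}) {
            System.out.println(name + " --> " + splitName(name));
        }
    }
}
```

Returning a list rather than printing inside the loop makes it easy to hand the pieces to a spell checker afterwards.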
Daren Schwenke
1
public static void main(String[] args)
{
    for (String w : "extended2004FeeStructure".split("(?<!(^|[A-Z0-9]))(?=[A-Z0-9])|(?<!^)(?=[A-Z][a-z])")) {
        System.out.println(w);
    }
}

This is the corrected version.

Vitaly Dyatlov
1

As far as I can see, the answer to your previous question was almost perfect. If I were you, I would just add a second split operation, this time before the first digit that follows a letter inside each word.

Here is an example:

// requires java.util.ArrayList and java.util.List
String data = "2Hello2000WORLDHello2000WORLD";
// your previous split
String[] myFirstSplit = data.split("(?<!(^|[A-Z]))(?=[A-Z])|(?<!^)(?=[A-Z][a-z])");

// store the results in a list, since the final number of pieces is not known in advance
List<String> list = new ArrayList<>();
for (String s : myFirstSplit) {
    // if a piece contains a digit right after a letter, split it there
    for (String tmp : s.split("(?<=[a-zA-Z])(?=[0-9])")) {
        list.add(tmp);
    }
}
System.out.println(list);
// prints [2, Hello, 2000, WORLD, Hello, 2000, WORLD]
Pshemo
  • This will work in my case, but I am not sure how efficient it will be. I need to compare all the variable names in a code base (which may easily contain thousands or lakhs of variable names) against a spell-check dictionary. – Unni Kris May 31 '12 at 10:04
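
On the efficiency concern: `String.split` generally compiles its regex argument on each call, so when many names are processed it may help to compile the pattern once and reuse it. The following is a minimal sketch under that assumption. It uses the single combined regex from the first answer's edit so that one pass per name is enough. The class name `NameSplitter` and the method `splitAll` are invented for illustration, and no timings are claimed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class NameSplitter {
    // Compiled once and reused for every name (regex taken from the first answer's edit).
    private static final Pattern BOUNDARY = Pattern.compile(
            "(?<!(^|[A-Z0-9]))(?=[A-Z0-9])"
            + "|(?<!(^|[^A-Z]))(?=[0-9])"
            + "|(?<!(^|[^0-9]))(?=[A-Za-z])"
            + "|(?<!^)(?=[A-Z][a-z])");

    // Hypothetical helper: splits every name and collects all pieces in order.
    static List<String> splitAll(List<String> names) {
        List<String> words = new ArrayList<>();
        for (String name : names) {
            words.addAll(Arrays.asList(BOUNDARY.split(name)));
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(splitAll(Arrays.asList("extendedStructureFor2004", "2Hello2000WORLD")));
    }
}
```

Whether this matters in practice depends on the code base, so measuring with real data would be the safer guide.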
0

After you have split the name like this:

extendedStructureFor2004 --> extended, Structure, For2004

store the pieces in an array, say "arr".

Then apply this regex to each element (JavaScript):

var numberPattern = /[0-9]+/g;
var numMatch = arr[i].match(numberPattern);

numMatch will now contain the numerals you want.
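
The snippet above is JavaScript, while the question is about Java. A rough Java counterpart, assuming the goal is simply to pull the digit runs out of a piece such as For2004, might look like the sketch below. The class name `DigitRunExtractor` and the method `digitRuns` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DigitRunExtractor {
    // Same pattern as the JavaScript /[0-9]+/g: one or more consecutive digits.
    private static final Pattern DIGITS = Pattern.compile("[0-9]+");

    // Hypothetical helper: returns every run of digits found in the token, in order.
    static List<String> digitRuns(String token) {
        List<String> runs = new ArrayList<>();
        Matcher m = DIGITS.matcher(token);
        while (m.find()) {
            runs.add(m.group());
        }
        return runs;
    }

    public static void main(String[] args) {
        System.out.println(digitRuns("For2004"));
    }
}
```

Note that this only finds the numerals. The letters before them (here, For) still have to be separated in another step.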

Kabilan S