1

I am rusty on regular expressions and need some help. A JS code base I inherited uses a mix of camel case and snake case for things like variable names and object properties.

I am trying to formulate a regular expression I can use that will identify all the camel cased strings, and then be able to replace those strings with snake casing. The part I am struggling with is identifying the camel cased strings under the conditions I have.

Identifying which strings are camel case: In this document, all camel cased strings start with a lower case letter, an underscore, or a $, and then use a capital letter at some point later in the string. Examples are: someCamelCasedString, _someCamelCasedString, and $someCamelCasedString. The regular expression needs to take into account that some of the strings I am trying to match may be object properties, so it should be able to identify things like Foo._someCamelCasedString.bar or Foo[_someCamelCasedString].bar.

Paul T
  • FYI I've fleshed out the answer to consider some refinements and edge cases. Let us know if it needs to go in a different direction. – zx81 May 08 '14 at 23:22
  • Thanks @zx81, your answer was perfect! Sorry for the delay on the upvote. – Paul T May 10 '14 at 20:19

3 Answers

3

This identifies all occurrences of "strict" camel case (only letters). Whether they start with _ or $ or foofoo doesn't matter.

[a-z]+[A-Z][a-zA-Z]*

An edge case is cameL. Is that proper camel case? I have assumed it is, but we can change that.

See demo
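To make the behavior concrete, here is a minimal sketch that lists what the strict pattern matches. The sample string and the variable names are made up for illustration. Note that a leading _ or $ is not part of the match, since the pattern starts at the first lower case letter.

```javascript
// Strict pattern from above, with the global flag so match() returns every hit.
const strictCamel = /[a-z]+[A-Z][a-zA-Z]*/g;

// Hypothetical sample input, modeled on the examples in the question.
const sample = 'Foo._someCamelCasedString.bar = $otherValue + snake_case + cameL;';

// match() returns null when nothing matches, so fall back to an empty array.
const strictMatches = sample.match(strictCamel) || [];

console.log(strictMatches);
// [ 'someCamelCasedString', 'otherValue', 'cameL' ]
```

Foo, bar and snake_case are skipped because none of them has a lower case letter followed later by a capital within a run of letters.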

If you want to allow other characters in the string (digits etc) then we can add them in the character classes. So this is a starting point to be refined depending on your requirements.

For instance if you know that you're happy with digits and underscores, you can go with this:

[a-z]\w*?[A-Z]\w*

If you also want to allow dollar signs in the name (a character that @Jongware says JS names allow), you can go with this:

[a-z][\w$]*[A-Z][\w$]*
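For the replacement half of the question, one possible approach is to pass a function to replace(). This is only a sketch: toSnakeCase is a hypothetical helper, the sample line is invented, and the helper assumes every capital letter starts a new word (so a run of capitals such as an acronym would be split letter by letter).

```javascript
// Pattern from above, with the global flag so replace() visits every match.
const camelWithDollar = /[a-z][\w$]*[A-Z][\w$]*/g;

// Hypothetical helper: put "_" in front of each capital and lower-case it.
function toSnakeCase(name) {
  return name.replace(/[A-Z]/g, (capital) => '_' + capital.toLowerCase());
}

// Hypothetical sample line of source code.
const line = 'Foo._someCamelCasedString.bar = userId2 + $rootScope;';

// replace() calls toSnakeCase with the matched text as its first argument.
const converted = line.replace(camelWithDollar, toSnakeCase);

console.log(converted);
// Foo._some_camel_cased_string.bar = user_id2 + $root_scope;
```

The leading _ and $ survive because they sit outside the match and are left untouched.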

Then there is the question of what constitutes the boundary of a valid string, so that we can perhaps devise some anchor (perhaps with sneaky lookaheads, since js doesn't support lookbehinds) in order to avoid false positives.
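As a rough illustration of the boundary problem, the sketch below compares the unanchored pattern with one that captures the preceding delimiter and writes it back, which works without any lookbehind. The boundary rule (start of input, or a character that is neither a word character nor $), the snake helper and the sample input are all assumptions made for this example. Engines that implement ES2018, which is newer than this answer, also accept a lookbehind such as (?<![\w$]), but the capture-group form does not rely on that.

```javascript
// Assumed boundary rule: a name starts at the beginning of the input or right
// after a character that cannot be part of an identifier.
const boundedCamel = /(^|[^\w$])([a-z_$][\w$]*[A-Z][\w$]*)/g;

// Hypothetical helper: put "_" in front of each capital and lower-case it.
function snake(name) {
  return name.replace(/[A-Z]/g, (capital) => '_' + capital.toLowerCase());
}

// Hypothetical input: FooBar is PascalCase and should be left alone.
const source = 'FooBar[_someCamelCasedString].bar = $someCamelCasedString;';

// Without a boundary, a match can start in the middle of FooBar.
const unanchored = source.replace(/[a-z][\w$]*[A-Z][\w$]*/g, snake);

// With the boundary, the delimiter is captured and written back unchanged.
const anchored = source.replace(
  boundedCamel,
  (match, prefix, name) => prefix + snake(name)
);

console.log(unanchored);
// Foo_bar[_some_camel_cased_string].bar = $some_camel_cased_string;
console.log(anchored);
// FooBar[_some_camel_cased_string].bar = $some_camel_cased_string;
```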

zx81
  • Note: JS does not allow dashes in names, but it does allow underscore, digits, and a couple of unexpected characters (`$` comes to mind -- see http://stackoverflow.com/questions/1661197/valid-characters-for-javascript-variable-names for a full list). I hope the OP is not finding variables in "καμήλαΥπόθεση" :) – Jongware May 08 '14 at 23:26
  • @Jongware Thanks for your input, changed the "dashes" example to a "dollars" example. :) – zx81 May 08 '14 at 23:30
  • @PaulT Hey Paul is this solved, or are you still having trouble with it? – zx81 May 09 '14 at 21:10
1

Maybe something like this:

/(\w|\$)+([A-Z])\w+/gm

You can play around with it and see the examples here: http://regexr.com/38qkq (the site also explains what each piece of the regular expression means).

HJ05
1

/(?:^|\s|[^\w$])([a-z_$][a-zA-Z]*[A-Z][a-zA-Z]*)/gm

Test http://regex101.com/r/pH1aB7
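Because the leading group in this pattern is non-capturing but still part of the overall match, the identifier itself is in capture group 1. A small sketch of reading it, assuming a runtime that has String.prototype.matchAll; the two-line input and the variable names are invented for the example:

```javascript
// Regex from above; group 1 holds the identifier without its leading delimiter.
const camelName = /(?:^|\s|[^\w$])([a-z_$][a-zA-Z]*[A-Z][a-zA-Z]*)/gm;

// Hypothetical two-line input.
const text =
  'someCamelCasedString = Foo[_someCamelCasedString].bar;\n' +
  'var snake_case = $someCamelCasedString;';

// Keep only the captured names, dropping the delimiter that precedes each one.
const names = Array.from(text.matchAll(camelName), (m) => m[1]);

console.log(names);
// [ 'someCamelCasedString', '_someCamelCasedString', '$someCamelCasedString' ]
```

When replacing with this pattern, the delimiter in front of group 1 has to be written back as well, since it belongs to the matched text.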

Wizard of Ogz