I managed to cobble together a Python function that uses regular expressions to convert camelCase to snake_case. It works for all my test cases, yet I still have a couple of questions.

1) What is each of the three statements actually doing?

import re

test_cases = list()
test_cases.append('camelCase')
test_cases.append('camelCaseCase')
test_cases.append('camel2Case')
test_cases.append('camel12Case')
test_cases.append('camel12Case')
test_cases.append('camelCaseURL')
test_cases.append('camel2CaseURL')
test_cases.append('camel12CaseURL')
test_cases.append('camel12Case2URL')
test_cases.append('camel12Case12URL')
test_cases.append('CamelCase')
test_cases.append('CamelCaseCase')
test_cases.append('URLCamelCase')


def camel_to_snake(string):
    string = re.sub('(.)([A-Z][a-z]+)', r'\1_\2', string)
    string = re.sub('(.)([0-9]+)', r'\1_\2', string)
    return re.sub('([a-z0-9])([A-Z])', r'\1_\2', string).lower()


for string in test_cases:
    print(string + ' -> ' + camel_to_snake(string))

Which results in:

camelCase -> camel_case
camelCaseCase -> camel_case_case
camel2Case -> camel_2_case
camel12Case -> camel_12_case
camel12Case -> camel_12_case
camelCaseURL -> camel_case_url
camel2CaseURL -> camel_2_case_url
camel12CaseURL -> camel_12_case_url
camel12Case2URL -> camel_12_case_2_url
camel12Case12URL -> camel_12_case_12_url
CamelCase -> camel_case
CamelCaseCase -> camel_case_case
URLCamelCase -> url_camel_case
user1507844
  • There's always a simpler way than regex, but I think the question you want to ask is "Is there a better way," provided you have a definition for "better." That would be a good question for [code review](http://codereview.stackexchange.com/) – wnnmaw Jan 16 '14 at 18:24
  • You mean you don't actually know what code you wrote is doing? – Martijn Pieters Jan 16 '14 at 18:24
  • For alternate approaches, see [this famous question](http://stackoverflow.com/questions/1175208/elegant-python-function-to-convert-camelcase-to-camel-case). – DSM Jan 16 '14 at 18:27
  • @DSM: I'd say this is a duplicate of that one, actually. – Martijn Pieters Jan 16 '14 at 18:34
  • @MartijnPieters, yes I don't actually know what the code I wrote is doing. I just messed around with it until I got it to work. Which is why I'm trying to learn what it is actually doing. It isn't quite a dupe of the question you reference b/c I want numbers between words to be separated by underscores. I took a good chunk of the code from the question you reference though. – user1507844 Jan 16 '14 at 18:41
  • I'm reviewing this in the Close Votes queue. I agree that it's not a dupe, because the linked question asks how to convert, and in this one you have code that does the conversion and you're asking what it does. But what exactly do you mean when you say you don't know what it's doing? You know what results you get for various input. If you mean you don't understand how it works, can you be more specific about what you don't understand? I'd vote to leave open on the question of whether it's a dupe, but as it is it could still be closed as "unclear what you're asking" or "too broad". – Adi Inbar Jan 16 '14 at 23:47
  • I don't know exactly what each statement is accomplishing which is why I accepted the answer from @F.J which explains that. – user1507844 Jan 17 '14 at 03:43

1 Answer

To answer your second question first: this seems like a perfectly reasonable way to accomplish the task, but it may be less maintainable than other approaches because it is hard to see at a glance how it works.

Here is a breakdown of what each line does:

  • string = re.sub('(.)([A-Z][a-z]+)', r'\1_\2', string)
    Adds an underscore immediately before every uppercase character that is followed by one or more lowercase characters, except at the beginning of the string (the leading (.) requires some character before it).

  • string = re.sub('(.)([0-9]+)', r'\1_\2', string)
    Adds an underscore immediately before any group of consecutive digits, except at the beginning of the string.

  • return re.sub('([a-z0-9])([A-Z])', r'\1_\2', string).lower()
    Adds an underscore immediately before any uppercase character that has a lowercase character or digit directly before it, then converts the whole string to lowercase and returns it.
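
One way to see the breakdown above in action is to print the string after each substitution. The following is only a minimal sketch: it reuses the three patterns from the question unchanged, while the helper name trace_camel_to_snake and the PATTERNS list are hypothetical names introduced here for illustration.

```python
import re

# The three patterns from the question, in the order they are applied.
# PATTERNS and trace_camel_to_snake are illustrative names, not part of
# the original code.
PATTERNS = [
    r'(.)([A-Z][a-z]+)',     # uppercase letter followed by lowercase letters
    r'(.)([0-9]+)',          # run of consecutive digits
    r'([a-z0-9])([A-Z])',    # uppercase letter after a lowercase letter or digit
]


def trace_camel_to_snake(string):
    """Return the string after each substitution, plus the lowercased result."""
    stages = []
    for pattern in PATTERNS:
        string = re.sub(pattern, r'\1_\2', string)
        stages.append(string)
    stages.append(string.lower())
    return stages


for stage in trace_camel_to_snake('camel12Case12URL'):
    print(stage)
# camel12_Case12URL
# camel_12_Case_12URL
# camel_12_Case_12_URL
# camel_12_case_12_url
```

In this trace the first substitution only splits before "Case", the second separates both digit groups, and the third is what finally splits off the all-uppercase "URL", which the first pattern cannot match because no lowercase letter follows it.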

Andrew Clark