15

I'm new to Python and I am trying to replace all uppercase letters within a word with underscores, for example:

ThisIsAGoodExample

should become

this_is_a_good_example

Any ideas/tips/links/tutorials on how to achieve this?

Davor Lucic
TiGer
  • 3
    http://stackoverflow.com/questions/1175208/does-the-python-standard-library-have-function-to-convert-camelcase-to-camel-case – Josh Lee Sep 06 '11 at 15:15
  • The example you give matches neither the title nor the description of this question. Are you trying to replace all uppercase characters with underscores or are you trying to convert CamelCase to lowercase_underscore_separated? You'll find that unless you can explain in words what it is you want to do, solving it in Python (or any other language) is going to be too challenging. – johnsyweb Sep 13 '11 at 21:44

7 Answers

18

Here's a regex way:

import re
example = "ThisIsAGoodExample"
print(re.sub(r'(?<!^)(?=[A-Z])', '_', example).lower())

This says: "Find the points in the string that aren't preceded by the start of the string and are followed by an uppercase character, and insert an underscore there." Then we lower()-case the whole thing.
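To see where those zero-width positions fall, here is a small sketch. The helper name `camel_to_snake` and the two extra inputs are illustrative assumptions, not part of the answer above:

```python
import re

# Matches the empty position before each uppercase ASCII letter,
# except at the very start of the string.
BOUNDARY = re.compile(r'(?<!^)(?=[A-Z])')

def camel_to_snake(name):
    """Insert '_' at each boundary, then lowercase the result."""
    return BOUNDARY.sub('_', name).lower()

print(camel_to_snake("ThisIsAGoodExample"))  # this_is_a_good_example
print(camel_to_snake("thisIsAGoodExample"))  # this_is_a_good_example
print(camel_to_snake("HTTPResponse"))        # h_t_t_p_response
```

Note that every capital starts a new word here, so an acronym such as HTTP is split letter by letter.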

Kris Jenkins
  • 3
    That misses a lot of uppercase letters. It wouldn't handle one spelling of my name, "Éric", for example. IIRC, `\p{Lu}` is the appropriate pattern, not `[A-Z]`. – ikegami Sep 06 '11 at 17:17
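For what it's worth, Python's built-in `re` module does not support `\p{Lu}` (the third-party `regex` package does). A rough sketch of a Unicode-aware alternative that avoids regular expressions is to let `str.isupper()` decide what counts as uppercase. The function name `to_snake_unicode` is made up for illustration:

```python
def to_snake_unicode(name):
    """Put '_' before every uppercase character except the first, then lowercase."""
    pieces = []
    for index, char in enumerate(name):
        # str.isupper() follows Unicode case rules, so it is not limited to A-Z.
        if index > 0 and char.isupper():
            pieces.append('_')
        pieces.append(char.lower())
    return ''.join(pieces)

result = to_snake_unicode("ÉricIsHere")
```

With a precomposed "É" in the input, `result` should come out as `éric_is_here`.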
12
import re
"_".join(l.lower() for l in re.findall('[A-Z][^A-Z]*', 'ThisIsAGoodExample'))

EDIT: Actually, this only works if the first letter is uppercase. Otherwise, this (taken from here) does the right thing:

def convert(name):
    s1 = re.sub('(.)([A-Z][a-z]+)', r'\1_\2', name)
    return re.sub('([a-z0-9])([A-Z])', r'\1_\2', s1).lower()
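A quick side-by-side of the two snippets may make the difference concrete. The wrapper name `split_on_capitals` and the test inputs are assumptions added for this sketch:

```python
import re

def split_on_capitals(name):
    # First snippet: only finds chunks that begin with an uppercase letter.
    return "_".join(part.lower() for part in re.findall('[A-Z][^A-Z]*', name))

def convert(name):
    # Second snippet: two passes, so runs of capitals stay together.
    s1 = re.sub('(.)([A-Z][a-z]+)', r'\1_\2', name)
    return re.sub('([a-z0-9])([A-Z])', r'\1_\2', s1).lower()

print(split_on_capitals('thisIsAGoodExample'))  # is_a_good_example
print(convert('thisIsAGoodExample'))            # this_is_a_good_example
print(convert('HTTPResponseCode'))              # http_response_code
```

The first call loses the leading "this" because it does not start with a capital, while `convert` keeps it and also treats an acronym as a single word.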
Community
xubuntix
5

This generates a list of items, where each item is "_" followed by the lowercased letter if the character was originally an uppercase letter, or the character itself if it wasn't. Then it joins them together into a string and removes any leading underscores that might have been added by the process:

inputstring = 'ThisIsAGoodExample'  # sample input
print(''.join('_' + char.lower() if char.isupper() else char
              for char in inputstring).lstrip('_'))

BTW, you haven't specified what to do with underscores that are already present in the string. I wasn't sure how to handle that case so I punted.
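If existing underscores do matter, one possible policy (an assumption for this sketch, not something the question specifies) is to collapse any run of underscores into a single one afterwards. The function name `to_snake_collapsing` is hypothetical:

```python
import re

def to_snake_collapsing(inputstring):
    # Same per-character step as in the answer.
    raw = ''.join('_' + char.lower() if char.isupper() else char
                  for char in inputstring).lstrip('_')
    # Assumed policy: squeeze runs of underscores down to a single one.
    return re.sub('_+', '_', raw)

print(to_snake_collapsing('This_IsAGoodExample'))  # this_is_a_good_example
```

Without the final `re.sub`, the same input would come out as `this__is_a_good_example`, with a doubled underscore.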

Kirk Strauser
2
example = 'ThisIsAGoodExample'
# Don't put an underscore before first character.
new_example = example[0].lower()
for character in example[1:]:
    # Append an underscore if the character is uppercase.
    if character.isupper():
        new_example += '_'
    new_example += character.lower()
Tyler Crompton
2

As no-one else has offered a solution using a generator, here's one:

>>> sample = "ThisIsAGoodExample"
>>> def upperSplit(data):
...   buff = ''
...   for item in data:
...     if item.isupper():
...       if buff:
...         yield buff
...         buff = ''
...     buff += item
...   yield buff
...
>>> list(upperSplit(sample))
['This', 'Is', 'A', 'Good', 'Example']
>>> "_".join(upperSplit(sample)).lower()
'this_is_a_good_example'
MattH
0

An attempt at a readable version:

import re

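# A chunk is an uppercase letter plus what follows, or a leading run without uppercase letters.
_uppercase_part = re.compile('[A-Z][^A-Z]*|[^A-Z]+')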

def uppercase_to_underscore(name):
    result = ''
    for match in _uppercase_part.finditer(name):
        if match.span()[0] > 0:
            result += '_'
        result += match.group().lower()
    return result
yeoman
0

Walk through your string; each time you encounter an uppercase letter, insert an _ before it and then switch that character to lowercase.

KevinDTimm