
I am trying to determine whether a Python list item consists of a single character and, if so, replace it with an expanded version. The issue I am facing is that I cannot get it to detect a single letter (X) without it also converting, say, boxer -> boCrossBreeder.

import re
list = ['canine', 'dog', 'X', 'boxer', 'XBreed', ' x ']
list_trimmed = [re.sub(r'\040x\040', 'CrossBreed', lst) for lst in list]

This works for replacing the ' x ', but if I try

list_trimmed = [re.sub(r'x', 'CrossBreed', lst) for lst in list]

it produces boCrossBreeder, because it also matches the x inside a word in a list item.

user5067291

3 Answers


You can use regex word boundaries to detect whether it's a single character, e.g. \bx\b

See example here: https://regex101.com/r/FbtRnN/2

list = ['canine', 'dog', 'X', 'boxer', 'XBreed', ' x ']
list_trimmed = [re.sub(r'\bx\b', 'CrossBreed', lst) for lst in list]
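To see how the word-boundary pattern behaves on the sample data, here is a minimal sketch. The names `items` and `replace_standalone_x` are made up for illustration, and the `re.IGNORECASE` flag is an added assumption so that the uppercase 'X' entry is matched too; the lowercase-only pattern leaves 'X' untouched.

```python
import re

# Sample data from the question (renamed to avoid shadowing the built-in `list`).
items = ['canine', 'dog', 'X', 'boxer', 'XBreed', ' x ']

def replace_standalone_x(text):
    # \b matches between a word character and a non-word character
    # (or the start/end of the string), so the x in 'boxer' is skipped.
    # re.IGNORECASE is an assumption added here to also catch 'X'.
    return re.sub(r'\bx\b', 'CrossBreed', text, flags=re.IGNORECASE)

result = [replace_standalone_x(item) for item in items]
print(result)

# Without the flag, the uppercase entry is left as it is.
case_sensitive = re.sub(r'\bx\b', 'CrossBreed', 'X')
print(case_sensitive)
```

Note that ' x ' keeps its surrounding spaces, since only the letter itself is replaced.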
stackErr

You can use the start-of-string (^) and end-of-string ($) anchors, like so:

list_strs = ['canine', 'dog', 'X', 'boxer', 'XBreed', ' x ']
list_trimmed = [re.sub(r'^[Xx]$', 'CrossBreed', lst) for lst in list_strs]

Also, please note that list is the name of a built-in type in Python. Using it as a variable name shadows the built-in, so you should avoid it.

I see another answer mentioning the word boundary operator (\b), but that does not correctly cover all scenarios: the string 'canine X dog' will have its X replaced, even though it is not a single-character string.
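The difference between the two approaches can be sketched as follows. The `samples` list is invented for illustration, and `^[Xx]$` is assumed here as one way to anchor both ends of the string while accepting either case.

```python
import re

# Hypothetical inputs, including a multi-word string containing a lone X.
samples = ['X', 'x', 'boxer', 'canine X dog']

# Word boundaries: replaces any standalone X/x, even inside a longer string.
word_boundary = [re.sub(r'\b[Xx]\b', 'CrossBreed', s) for s in samples]

# Anchors: replaces only when the whole string is a single X or x.
whole_string = [re.sub(r'^[Xx]$', 'CrossBreed', s) for s in samples]

print(word_boundary)
print(whole_string)
```

Which behaviour is wanted depends on whether a lone X inside a longer item should count, which only the asker can settle.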

João Amaro
  • I agree with the `'canine X dog'` part. However, you don't cover the upper `X`. Btw, the way I read the question is **any** single char, though I might be wrong =). – JvdV Feb 14 '20 at 16:10
  • You are right, it's my mistake when writing the answer that I only used a lowercase x. Editing, thanks for noticing and bringing it up. – João Amaro Feb 14 '20 at 16:20
  • Also, re-reading the question again, I am now second guessing myself, but only the OP can really answer without a doubt. – João Amaro Feb 14 '20 at 16:22

Try this.

list_trimmed = [re.sub(r'\b[Xx]\b', 'CrossBreed', lst) for lst in list]

\b.\b

The word boundaries on both sides ensure the character stands on its own.

[Xx]

Matches both the uppercase and the lowercase letter.
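If the goal is any standalone single character rather than only x (one possible reading of the question), a rough sketch might use `\w` between the boundaries. The sample strings below are made up for illustration. A literal `.` between the boundaries is shown for contrast, because it can also match a space that sits between two words.

```python
import re

# \w restricts the match to a single word character standing alone.
standalone_chars = re.findall(r'\b\w\b', 'boxer X dog y')
print(standalone_chars)

# With '.', the space between two one-letter words also satisfies
# "boundary, any character, boundary", so it is matched as well.
with_dot = re.findall(r'\b.\b', 'a b')
with_word_class = re.findall(r'\b\w\b', 'a b')
print(with_dot)
print(with_word_class)
```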

Saif Asif