0

How to remove the part with "_" and numbers connected together in a string using Python?

For example,

Input: ['apple_3428','red_458','D30','green']

Expected output: ['apple','red','D30','green']

Thanks!

Leo
  • What will happen to ['green_', 'green_458aaa']? – eroot163pi Jul 20 '21 at 11:41
  • My samples don't contain such cases, so I hadn't considered the two you mentioned. It would certainly be better to handle them. In that case the output should be ['green_','greenaaa']. In other words, remove only the part where "_" and digits are connected together. :) – Leo Jul 20 '21 at 11:48
  • Got it, I added possible solutions for a few cases – eroot163pi Jul 20 '21 at 11:55

5 Answers

2

This should work:

my_list = ['apple_3428','red_458','D30','green']
new_list = []
for el in my_list:
    new_list.append(el.split('_')[0])

new_list will be ['apple', 'red', 'D30', 'green'].

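Basically, you split every element of my_list (each is assumed to be a string) on _ and take the first piece, i.e. the part before the first _. If _ is not present, the string is not split and is kept whole. Note that this drops everything after the first _, whether or not it consists of digits.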
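
If some strings may contain an underscore that is not followed by digits only, a stricter variant could check the trailing part before dropping it. The following is a minimal sketch, not part of the original answer; the helper name strip_numeric_suffix is hypothetical, and it assumes only a trailing "_<digits>" part should be removed:

```python
def strip_numeric_suffix(text):
    """Drop a trailing '_<digits>' part; leave every other string untouched."""
    head, sep, tail = text.rpartition('_')
    # Strip only when an underscore was found and everything after it is digits.
    if sep and tail.isdigit():
        return head
    return text


my_list = ['apple_3428', 'red_458', 'D30', 'green']
new_list = [strip_numeric_suffix(el) for el in my_list]
print(new_list)  # ['apple', 'red', 'D30', 'green']
```

With this version, a string such as 'dark_green_12' becomes 'dark_green' rather than 'dark', and 'green_' is left unchanged.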

SilentCloud
2

Using regular expressions with re.sub:

import re

[re.sub(r"_\d+$", "", x) for x in ['apple_3428','red_458','D30','green']]
# ['apple', 'red', 'D30', 'green']

This will strip an underscore followed by only digits from the end of a string.
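
To illustrate what the $ anchor changes, here is a small sketch comparing the anchored pattern with an unanchored one on the edge cases raised in the comments. The names TRAILING, ANYWHERE and samples are made up for this example:

```python
import re

TRAILING = re.compile(r"_\d+$")  # underscore + digits only at the end of the string
ANYWHERE = re.compile(r"_\d+")   # underscore + digits at any position

samples = ['apple_3428', 'green_', 'green_458aaa']

print([TRAILING.sub("", s) for s in samples])
# ['apple', 'green_', 'green_458aaa']
print([ANYWHERE.sub("", s) for s in samples])
# ['apple', 'green_', 'greenaaa']
```

Which one fits depends on whether text after the digits (as in 'green_458aaa') should survive with or without the "_458" part.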

user2390182
1

Try this:

output_list = [x.split('_')[0] for x in input_list]
Riccardo Bucco
1

I am not sure which behavior is needed, so here are a few options.

Note that a list comprehension is preferable to map + lambda and is more Pythonic; see "List comprehension vs map".

  1. \d+ stands for at least one digit
  2. \d* stands for zero or more digits
>>> import re
>>> list(map(lambda x: re.sub(r'_\d+$', '', x), ['green_', 'green_458aaa']))
['green_', 'green_458aaa']
>>> list(map(lambda x: re.sub(r'_\d*', '', x), ['green_', 'green_458aaa']))
['green', 'greenaaa']
>>> list(map(lambda x: re.sub(r'_\d+', '', x), ['green_', 'green_458aaa']))
['green_', 'greenaaa']
>>> list(map(lambda x: x.split('_', 1)[0], ['green_', 'green_458aaa']))
['green', 'green']
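
For reference, the '_\d+' option above, which gives the output the asker described in the comments, might be written as a list comprehension roughly like this. It is only a sketch; the names UNDERSCORE_DIGITS and remove_underscore_digits are hypothetical:

```python
import re

# Precompile the pattern: an underscore followed by at least one digit.
UNDERSCORE_DIGITS = re.compile(r'_\d+')


def remove_underscore_digits(items):
    """Return a new list with every '_<digits>' run removed from each string."""
    return [UNDERSCORE_DIGITS.sub('', item) for item in items]


print(remove_underscore_digits(['green_', 'green_458aaa']))
# ['green_', 'greenaaa']
print(remove_underscore_digits(['apple_3428', 'red_458', 'D30', 'green']))
# ['apple', 'red', 'D30', 'green']
```
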
eroot163pi
0
input_list = ['apple_3428','red_458','D30','green']
output_list = []

for i in input_list:
    output_list.append(i.split('_', 1)[0])

You can simply split the string.

CozyCode