8

Given a list:

mylist = ['dog', 'cat', 'mouse_bear', 'lion_tiger_rabbit', 'ant']

I'd like a one-liner to return a new list:

['dog', 'cat', 'mouse', 'bear', 'lion', 'tiger', 'rabbit', 'ant']
rjmoggach
  • 2
    is this a code golf? one-liner seems an unnecessary restriction. – Stael Aug 02 '17 at 16:03
  • This question really confuses me - I get that trying to write 1-liners is a fun exercise, but I assumed that that sort of thing was 'frowned upon' - this is a low effort question with dubious requirements and I was surprised that everyone was so enthusiastic about it. – Stael Aug 03 '17 at 09:10
  • well the goal was to get a pythonic one liner, not a perl-y, one liner... so readable and semantically clever while also logical and efficient... :P – rjmoggach Aug 04 '17 at 03:34
  • I thought for sure a list comprehension answer would take this for elegance but in the end the simple join and split was fastest and easiest to read - surprised to say the least! – rjmoggach Aug 04 '17 at 03:42
  • Although I love list comps, one-liners are over-rated. I'd do this with traditional loops. – PM 2Ring Aug 28 '18 at 17:22

12 Answers

14

Another trick is first to join the list with underscores and then re-split it:

"_".join(mylist).split('_')
Eugene Sh.
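One caveat worth checking before relying on the join-and-split trick above: it behaves differently from the comprehension-based answers on an empty list, because splitting an empty string yields `['']`. A minimal sketch; the names `empty`, `joined_split` and `comprehension` are only for illustration:

```python
mylist = ['dog', 'cat', 'mouse_bear', 'lion_tiger_rabbit', 'ant']

# Join everything into one string, then split it back apart.
result = "_".join(mylist).split('_')
print(result)

# Edge case: joining an empty list gives '', and ''.split('_') gives [''].
empty = []
joined_split = "_".join(empty).split('_')
comprehension = [animal for word in empty for animal in word.split('_')]
print(joined_split)   # ['']
print(comprehension)  # []
```

If an empty input is possible, a guard such as `if mylist` or one of the comprehension-based answers may be the safer choice.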
6

Just use two `for` clauses in your comprehension, e.g.:

>>> mylist = ['dog', 'cat', 'mouse_bear', 'lion_tiger_rabbit', 'ant']
>>> [animal for word in mylist for animal in word.split('_')]
['dog', 'cat', 'mouse', 'bear', 'lion', 'tiger', 'rabbit', 'ant']
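The two `for` clauses read left to right in the same order as nested loops would. A small sketch of the equivalent loop form, where `flat` is a name chosen only for this illustration:

```python
mylist = ['dog', 'cat', 'mouse_bear', 'lion_tiger_rabbit', 'ant']

flat = []
for word in mylist:                 # first `for` clause (outer loop)
    for animal in word.split('_'):  # second `for` clause (inner loop)
        flat.append(animal)

# The comprehension builds the same list in a single expression.
assert flat == [animal for word in mylist for animal in word.split('_')]
print(flat)
```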
AChampion
4

This is not a one-liner, but it is nevertheless a valid option to consider if you want to return a generator:

def yield_underscore_split(lst):
    for x in lst:
        yield from x.split('_')

>>> list(yield_underscore_split(mylist))
['dog', 'cat', 'mouse', 'bear', 'lion', 'tiger', 'rabbit', 'ant']

Original answer, valid only for Python versions 3.3 to 3.7 and kept here for interested readers. Do not use!

>>> list([(yield from x.split('_')) for x in mylist])
['dog', 'cat', 'mouse', 'bear', 'lion', 'tiger', 'rabbit', 'ant']
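For a lazy one-liner that still works on Python 3.8 and later, a generator expression with two `for` clauses is one option. This is a sketch rather than part of the original answer, and `lazy_animals`, `first` and `rest` are hypothetical names:

```python
mylist = ['dog', 'cat', 'mouse_bear', 'lion_tiger_rabbit', 'ant']

# Nothing is split until the generator is consumed.
lazy_animals = (animal for word in mylist for animal in word.split('_'))

first = next(lazy_animals)  # pulls only the first item
rest = list(lazy_animals)   # consumes whatever remains
print(first, rest)
```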
cs95
  • 1
    @hiroprotagonist Ty. If OP is on 3.3 or above I would recommend this as being succinct as hell :) – cs95 Aug 02 '17 at 16:08
  • 1
    This is actually invalid syntax in Python 3.8 and up (a deprecation warning is issued in 3.7), don't use `yield` in comprehensions! See [yield in list comprehensions and generator expressions](//stackoverflow.com/q/32139885) – Martijn Pieters Aug 28 '18 at 17:16
3

Using the itertools recipe for flattening a list, you could do this:

from itertools import chain

mylist = ['dog', 'cat', 'mouse_bear', 'lion_tiger_rabbit', 'ant']

new_list = list(chain.from_iterable(item.split('_') for item in mylist))
print(new_list)
# ['dog', 'cat', 'mouse', 'bear', 'lion', 'tiger', 'rabbit', 'ant']

...or does the import statement violate your one-liner requirement?

hiro protagonist
3

Since so many answers here were posted (over ten), I thought it'd be beneficial to show some timing stats to compare the different methods posted:

-----------------------------------------
AChampion time: 2.6322
-----------------------------------------
hiro_protagonist time: 3.1724
-----------------------------------------
Eugene_Sh time: 1.0108
-----------------------------------------
cᴏʟᴅsᴘᴇᴇᴅ time: 3.5386
-----------------------------------------
jdehesa time: 2.9406
-----------------------------------------
mogga time: 3.1645
-----------------------------------------
Ajax1234 time: 2.4659
-----------------------------------------

Here's the script I used to test:

from timeit import timeit

setup = """
from itertools import chain
mylist = ['dog', 'cat', 'mouse_bear', 'lion_tiger_rabbit', 'ant']
"""

methods = {
    'AChampion': """[animal for word in mylist for animal in word.split('_')]""",
    'hiro_protagonist': """list(chain.from_iterable(item.split('_') for item in mylist))""",
    'Eugene_Sh': """'_'.join(mylist).split('_')""",
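    # Python 3.3-3.7 only: `yield` inside a comprehension is a SyntaxError on 3.8+, so drop this entry there.
    'cᴏʟᴅsᴘᴇᴇᴅ': """list([(yield from x.split('_')) for x in mylist])""",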
    'jdehesa': """sum((s.split("_") for s in mylist), [])""",
    'mogga': """[i for sublist in [j.split('_') for j in mylist] for i in sublist]""",
    'Ajax1234': """list(chain(*[[i] if "_" not in i else i.split("_") for i in mylist]))"""
}

print('-----------------------------------------')
for author, method in methods.items():
    print('{} time: {}'.format(author, round(timeit(setup=setup, stmt=method), 4)))
    print('-----------------------------------------')

Each method is run against the sample list from the question one million times (the default `number` for `timeit`), so each figure is the total time in seconds for all runs. To keep things readable, each result is rounded to four decimal places.
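The sample list is tiny, so the ranking may not carry over to larger inputs. Here is a hedged sketch for re-running a few of the methods on a longer list. The repetition factor of 100, `number=100` and the method labels are arbitrary assumptions, and no particular result is implied:

```python
from timeit import timeit

# Same sample data, repeated to make a 500-item list (assumed size).
setup = """
from itertools import chain
mylist = ['dog', 'cat', 'mouse_bear', 'lion_tiger_rabbit', 'ant'] * 100
"""

methods = {
    'comprehension': "[animal for word in mylist for animal in word.split('_')]",
    'chain': "list(chain.from_iterable(item.split('_') for item in mylist))",
    'join_split': "'_'.join(mylist).split('_')",
    'sum': "sum((s.split('_') for s in mylist), [])",
}

# Total seconds for `number` runs of each statement.
timings = {name: timeit(setup=setup, stmt=stmt, number=100)
           for name, stmt in methods.items()}

for name, seconds in timings.items():
    print('{} time: {}'.format(name, round(seconds, 4)))
```

Results depend on the machine and Python version, so it is worth measuring on data shaped like your own.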


Note: If you have a new, unique method that has not been posted here yet, contact me in the comments and I'll try to add a timing for it too.

Christian Dean
1

Split each item into sublists and flatten them:

[item for sublist in mylist for item in sublist.split("_")]

thaavik
1

One-liners are over-rated. Here's a solution using a "traditional" for loop.

mylist = ['dog', 'cat', 'mouse_bear', 'lion_tiger_rabbit', 'ant']

out = []
for s in mylist:
    if '_' in s:
        out.extend(s.split('_'))
    else:
        out.append(s)

print(out)

Output:

['dog', 'cat', 'mouse', 'bear', 'lion', 'tiger', 'rabbit', 'ant']

This also works:

out = []
for s in mylist:
    out.extend(s.split('_'))

It's shorter, but I think the previous version is clearer.

PM 2Ring
0

You can do:

mylist = ['dog', 'cat', 'mouse_bear', 'lion_tiger_rabbit', 'ant']
result = sum((s.split("_") for s in mylist), [])
print(result)
# ['dog', 'cat', 'mouse', 'bear', 'lion', 'tiger', 'rabbit', 'ant']
jdehesa
  • 2
    don't use `sum` for lists. performance suffers a lot (https://stackoverflow.com/questions/42593904/could-sum-be-faster-on-lists) – Jean-François Fabre Aug 02 '17 at 16:03
  • @Jean-FrançoisFabre I wasn't aware of that (or I was and then I forgot). I may still prefer `sum` to a nested generator in small cases for clarity (for lack of a `flatten` or `concatenate` function in the standard library), but it's good to know the performance impact. – jdehesa Aug 02 '17 at 16:10
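The performance point raised in the comments comes from how `sum` combines lists: each step uses `+`, which builds a new list and copies everything accumulated so far. A rough sketch of the two behaviours, where `acc` and `out` are illustrative names:

```python
mylist = ['dog', 'cat', 'mouse_bear', 'lion_tiger_rabbit', 'ant']

# Roughly what sum(parts, []) does: every `+` creates a brand-new list.
acc = []
for part in (s.split('_') for s in mylist):
    acc = acc + part  # copies all items gathered so far, then appends `part`

# Extending in place avoids the repeated copying.
out = []
for s in mylist:
    out.extend(s.split('_'))

assert acc == out == sum((s.split('_') for s in mylist), [])
print(out)
```

For a handful of short strings the difference is negligible, but the copying cost grows with the length of the input list.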
0

This works:

[i for sublist in [j.split('_') for j in mylist] for i in sublist]
rjmoggach
0
mylist = ['dog', 'cat', 'mouse_bear', 'lion_tiger_rabbit', 'ant']
animals = [a for item in mylist for a in item.split('_')]
print(animals)
Ivan Sheihets
-1

You can try this:

from itertools import chain

mylist = ['dog', 'cat', 'mouse_bear', 'lion_tiger_rabbit', 'ant']

new_list = list(chain(*[[i] if "_" not in i else i.split("_") for i in mylist]))

Output:

['dog', 'cat', 'mouse', 'bear', 'lion', 'tiger', 'rabbit', 'ant']
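The `if "_" not in i` branch is arguably redundant, since `str.split` already returns a one-element list when the separator is absent. A small check follows. It uses `chain.from_iterable` in place of argument unpacking, which is a swapped-in variant rather than this answer's own code, and `with_check` and `without_check` are illustrative names:

```python
from itertools import chain

mylist = ['dog', 'cat', 'mouse_bear', 'lion_tiger_rabbit', 'ant']

# A string without the separator splits into a single-item list.
assert 'dog'.split('_') == ['dog']

with_check = list(chain(*[[i] if "_" not in i else i.split("_") for i in mylist]))
without_check = list(chain.from_iterable(i.split("_") for i in mylist))

assert with_check == without_check
print(without_check)
```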
Ajax1234
-1

What I would actually do:

newlist = []

for i in mylist:
    newlist += i.split('_')
Stael