1

I have this list:

l=['abcdef', 'abcdt', 'neft', 'ryr', 'yyyyy', 'u', 'aaaaaaaaaa']

The lengths of the elements in this list are 6, 5, 4, 3, 5, 1, and 10, respectively.

I want to combine the elements so that each element of the newly created list has a length of at least 10. That means the following elements should be appended until the desired length is reached, with a space added at every joining point.

Thus, the list now becomes:

l=['abcdef abcdt', 'neft ryr yyyyy', 'u aaaaaaaaaa']

I tried combining the elements iteratively and in other ways, but nothing seems to work. Any suggestions?

MSeifert
  • 4
    Post the code that you've tried. Also, it looks like `l` is a list of strings, but your examples don't include quotes. Please add quotes to make it clear what the elements of `l` are. – Craig Apr 10 '17 at 00:58
  • 2
    Just do `l = [' '.join(l)]`. That satisfies the condition (if there's any way to satisfy it at all). – Stefan Pochmann Apr 10 '17 at 01:03
  • 1
    What would the result be if the list didn't contain that `'aaaaaaaaaa'`? Would the `'u'` get ignored? – Stefan Pochmann Apr 10 '17 at 01:30
  • Well, if 'aaaaaaaaaa' weren't there, the last 'u' would simply be added on rather than ignored. – Karmesh Maheshwari Apr 10 '17 at 01:34

3 Answers

4

You could use a generator that takes items from the iterable as long as the length requirement isn't fulfilled:

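def join_while_too_short(it, length):
    it = iter(it)
    # Looping with "for" ends the generator cleanly when the input runs out.
    for current in it:
        # Keep appending the following items until the string is long enough.
        while len(current) < length:
            try:
                current += ' ' + next(it)
            except StopIteration:
                # An uncaught StopIteration inside a generator becomes a
                # RuntimeError on Python 3.7+ (PEP 479), so stop here and
                # yield the remainder even though it is too short.
                break
        yield current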

When running this on your input it produces the correct result:

>>> l = ['abcdef', 'abcdt', 'neft', 'ryr', 'yyyyy', 'u', 'aaaaaaaaaa']
>>> list(join_while_too_short(l, 10))
['abcdef abcdt', 'neft ryr yyyyy', 'u aaaaaaaaaa']

Repeatedly concatenating strings isn't the most efficient approach in general. You could instead collect the pieces in a list and join them just before yielding, but this version should make the principle clearer.
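The collect-and-join idea could be sketched as follows. The function name `join_until_long_enough` is made up for this illustration, and yielding a too-short remainder as its own item is an assumption based on the comments under the question, not something the question itself specifies.

```python
def join_until_long_enough(items, length):
    """Yield space-joined groups of items, each at least `length` long.

    Assumption: a trailing group that never reaches `length` is still
    yielded, as a shorter string.
    """
    parts = []  # pieces of the group currently being built
    size = 0    # length the group would have once joined with spaces
    for item in items:
        # A separating space is only needed after the first piece.
        size += len(item) + (1 if parts else 0)
        parts.append(item)
        if size >= length:
            yield ' '.join(parts)
            parts, size = [], 0
    if parts:
        # Input exhausted before the group got long enough.
        yield ' '.join(parts)


l = ['abcdef', 'abcdt', 'neft', 'ryr', 'yyyyy', 'u', 'aaaaaaaaaa']
print(list(join_until_long_enough(l, 10)))
# ['abcdef abcdt', 'neft ryr yyyyy', 'u aaaaaaaaaa']
print(list(join_until_long_enough(l[:-1], 10)))
# ['abcdef abcdt', 'neft ryr yyyyy', 'u']
```

Each group is joined exactly once, so the cost does not depend on whether the interpreter optimizes in-place string concatenation.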


Note that the requirement can't always be fulfilled, because there might not be enough items at the end to create a string of the desired length. However, you said that you want to keep considering the next elements until the desired length is reached, and the presented approach does exactly that.

MSeifert
  • If you want clear *and* efficient, you could use `nxt += ' ' + next(it)`. – Stefan Pochmann Apr 10 '17 at 01:10
  • @StefanPochmann You're probably right. I also liked your `[' '.join(l)]` suggestion. :) – MSeifert Apr 10 '17 at 01:13
  • 1
    Your edited code doesn't work. And `current = current + ' ' + next(it)` doesn't benefit from the optimization that `current += ' ' + next(it)` has. Try both for example with `timeit(lambda: list(join_while_too_short(['a'] * 800000, 800000)), number=1)`. I get 62 seconds for your version and 0.3 seconds for mine. – Stefan Pochmann Apr 10 '17 at 01:21
  • @StefanPochmann Thanks for spotting that, I accidentally copied an old version to the answer. :-( I corrected it. – MSeifert Apr 10 '17 at 01:25
  • 1
    Just in case someone is wondering: [some string concatenation operations are optimized](http://stackoverflow.com/a/1350289/1672429). – Stefan Pochmann Apr 10 '17 at 01:28
1

A single pass that appends items until the condition is met should work fine. As far as I know, a list comprehension can't express an operation that spans several elements like this one. If pandas is an option, let us know by editing your question: the `DataFrame.shift` method lets you examine multiple rows at a time, for example in a lambda function, which might allow a solution without an explicit for loop.

l = ['abcdef', 'abcdt', 'neft', 'ryr', 'yyyyy', 'u', 'aaaaaaaaaa']

newlist = []
newitem = ''
for item in l:
    if not newitem:
        newitem = item
    else:
        newitem = newitem + " " + item
    if len(newitem) >= 10:
        newlist.append(newitem)
        newitem = ''

if newitem:  # keep any leftover that is shorter than 10 characters
    newlist.append(newitem)

print(newlist)

The output from Jupyter running Python 3.6 is as expected:

['abcdef abcdt', 'neft ryr yyyyy', 'u aaaaaaaaaa']
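For what it's worth, avoiding an explicit for loop doesn't require pandas. One possible sketch uses `functools.reduce` from the standard library; the names `combine_min_length` and `step` are invented for this example, and it swaps in a different technique from the loop above rather than reproducing it.

```python
from functools import reduce


def combine_min_length(words, min_len=10):
    """Merge consecutive words until each group is at least min_len long."""
    def step(groups, word):
        # Extend the last group while it is still shorter than min_len.
        if groups and len(groups[-1]) < min_len:
            return groups[:-1] + [groups[-1] + " " + word]
        # Otherwise the last group is complete, so start a new one.
        return groups + [word]

    return reduce(step, words, [])


l = ['abcdef', 'abcdt', 'neft', 'ryr', 'yyyyy', 'u', 'aaaaaaaaaa']
print(combine_min_length(l))
# ['abcdef abcdt', 'neft ryr yyyyy', 'u aaaaaaaaaa']
```

This copies the accumulated list at every step, so it is slower than the plain loop on long inputs. A trailing group that stays too short is kept as its own element, like the leftover handling above.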

Semicolons and Duct Tape
0

This should work. Note that the list is modified in place; if you don't want that, make a copy first. Tested with the case you gave.

def combine(a, n):
    i = 0
    while i < len(a):
        if len(a[i]) >= n:
            i += 1
        elif i + 1 < len(a):
            a[i:i + 2] = [a[i] + " " + a[i + 1]]
        elif len(a) > 1:
            a[i - 1:i + 1] = [a[i - 1] + " " + a[i]]
            break
        else:
            break
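A short usage sketch, assuming you want to keep the original list: `combine` returns nothing and changes its argument, so call it on a copy. The function is repeated here only to make the example self-contained, and the variable names `original` and `result` are illustrative.

```python
def combine(a, n):
    # Same logic as above: merge neighbours in place until each is >= n long.
    i = 0
    while i < len(a):
        if len(a[i]) >= n:
            i += 1
        elif i + 1 < len(a):
            a[i:i + 2] = [a[i] + " " + a[i + 1]]
        elif len(a) > 1:
            # Last element is too short: attach it to the previous one.
            a[i - 1:i + 1] = [a[i - 1] + " " + a[i]]
            break
        else:
            break


original = ['abcdef', 'abcdt', 'neft', 'ryr', 'yyyyy', 'u', 'aaaaaaaaaa']
result = list(original)  # shallow copy, so `original` stays untouched
combine(result, 10)
print(result)
# ['abcdef abcdt', 'neft ryr yyyyy', 'u aaaaaaaaaa']

short_tail = ['abcdef', 'abcdt', 'neft', 'ryr', 'yyyyy', 'u']
combine(short_tail, 10)
print(short_tail)
# ['abcdef abcdt', 'neft ryr yyyyy u']
```

Unlike the other answers, a too-short trailing element is merged into the previous group instead of being left as a separate item.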
Santos