Concatenating characters in nested lists

Question

I'm currently working with a data structure that looks like this:

['t','h','i','s',' ','i','s',' ','q','u','e','r','y',' ','i','t','e','m',' ','1','t','h','i','s',' ','i','s',' ','q','u','e','r','y',' ','i','t','e','m',' ','2', ['t','h','i','s',' ','i','s',' ','a',' ','s','u','b','q','u','e','r','y'], 't','h','i','s',' ','i','s',' ','q','u','e','r','y',' ','i','t','e','m',' ','3']

I got this list by parsing a query string with the parser from this Stack Overflow answer: https://stackoverflow.com/a/17141441

The query I parsed was:

(this is query item 1 this is query item 2(this is a subquery)this is query item 3)

The problem is that the parser works on individual characters and appends them to the list one by one. I need to get back to a structure like:

['this is query item 1 this is query item 2', ['this is a subquery'], 'this is query item 3']

I'm trying to work out how to either change the parser function to do this or add a post-processing step that joins the characters back together. Does anyone know of a solution for this?

For some reason this was marked as a duplicate, but it isn't: the answer it was marked as a duplicate of doesn't deal with nested lists. The answer given by @daniel-mesejo does, which is what I was searching for. — Yonathan, Dec 05 '18 at 00:56
Instead of doing damage control, I would recommend changing the code to return strings in the first place. — cs95, Dec 05 '18 at 01:12
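
A minimal sketch of what returning strings in the first place could look like. This is not the parser from the linked answer; parse_nested and its flush helper are hypothetical names, and the sketch assumes the only special characters are '(' and ')':

```python
def parse_nested(text):
    """Parse a parenthesised string into nested lists of strings."""
    stack = [[]]   # stack[-1] is the list currently being filled
    buffer = []    # characters of the string currently being read

    def flush():
        # Emit the buffered characters as one string, if there are any.
        if buffer:
            stack[-1].append(''.join(buffer))
            buffer.clear()

    for char in text:
        if char == '(':
            flush()
            stack.append([])          # open a new nested list
        elif char == ')':
            flush()
            if len(stack) == 1:
                raise ValueError('unbalanced closing parenthesis')
            finished = stack.pop()    # close the current nested list
            stack[-1].append(finished)
        else:
            buffer.append(char)

    flush()
    if len(stack) != 1:
        raise ValueError('unbalanced opening parenthesis')
    return stack[0]


query = '(this is query item 1 this is query item 2(this is a subquery)this is query item 3)'
parsed = parse_nested(query)
print(parsed[0])
# ['this is query item 1 this is query item 2', ['this is a subquery'], 'this is query item 3']
```

Because the whole query is wrapped in parentheses, the result has one extra outer list, so parsed[0] is the structure asked for in the question.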

Dani Mesejo · Accepted Answer · 2018-12-05T00:53:29.980

As a post-process step you could use itertools.groupby in a recursive function:

from itertools import groupby

data = ['t', 'h', 'i', 's', ' ', 'i', 's', ' ', 'q', 'u', 'e', 'r', 'y', ' ', 'i', 't', 'e', 'm', ' ', '1', 't', 'h',
        'i', 's',
        ' ', 'i', 's', ' ', 'q', 'u', 'e', 'r', 'y', ' ', 'i', 't', 'e', 'm', ' ', '2',
        ['t', 'h', 'i', 's', ' ', 'i', 's', ' ', 'a', ' ', 's', 'u', 'b', 'q', 'u', 'e', 'r', 'y'], 't', 'h', 'i', 's',
        ' ', 'i', 's', ' ', 'q', 'u', 'e', 'r', 'y', ' ', 'i', 't', 'e', 'm', ' ', '3']


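def join(lst):
    # groupby splits lst into runs of consecutive sublists and runs of consecutive characters
    for is_list, group in groupby(lst, key=lambda x: isinstance(x, list)):
        if is_list:
            # recurse into each sublist in the run and keep it as a nested list
            yield from (list(join(value)) for value in group)
        else:
            # collapse a run of characters into a single string
            yield ''.join(group)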


result = list(join(data))
print(result)

Output

['this is query item 1this is query item 2', ['this is a subquery'], 'this is query item 3']

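groupby splits the list into runs of consecutive sublists and runs of consecutive characters. A run of characters is concatenated with str.join; for a run of sublists, join is called recursively on each sublist. Note that the output reads 'item 1this' because the input list has no space between '1' and 't'.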
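
To make the grouping step concrete, here is a small sketch that prints the runs groupby produces. The sample list is made up for illustration and is not the question's data:

```python
from itertools import groupby

# Made-up input: two characters, two adjacent sublists, one character.
sample = ['a', 'b', ['c', 'd'], ['e'], 'f']

# Each run of consecutive items with the same key becomes one group.
runs = [(is_list, list(group))
        for is_list, group in groupby(sample, key=lambda x: isinstance(x, list))]
print(runs)
# [(False, ['a', 'b']), (True, [['c', 'd'], ['e']]), (False, ['f'])]
```

Adjacent sublists end up in the same group, which is why the function loops over every value in a list group instead of assuming it holds a single sublist.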

Perfect, this works exactly as intended, and the code is easy to understand as well. Thank you for this! — Yonathan, Dec 05 '18 at 00:52
