I have a list containing runs of consecutive empty ('') elements. I want to reduce each run to a single empty element. To illustrate:

I have the following test data:

test_data = ['hello', '', '', '', 'this is it', '', 'the one', '', '', '', '', 'img', 'out there', 'when']

That I wish to clean up to get this:

test_data = ['hello', '', 'this is it', '', 'the one', '', 'img', 'out there', 'when']

Notice that the cleaned list never has more than one empty element between non-empty elements.

I wrote the following code, but this leaves two empty elements:

output = []
newline_count = 0
for element in test_data:
    if element.strip() != '': output.append(element)
    else: 
        newline_count += 1
        if newline_count < 2: output.append('')
        else: newline_count = 0
print(output)

Where am I going wrong?

Code Monkey

2 Answers

You can use itertools.groupby.

import itertools

a_list  = ['hello', '', '', '', 'this is it', '', 'the one', '', '', '', '', 'img', 'out there', 'when']
res = []

for key, group in itertools.groupby(a_list, lambda x: x == ''):
    # key is True for a run of empty strings, False for a run of non-empty ones
    if not key:
        # keep every element of a non-empty run
        res.extend(group)
    else:
        # collapse a run of empty strings into a single ''
        res.append('')
print(res)

Output:

['hello', '', 'this is it', '', 'the one', '', 'img', 'out there', 'when']

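Explanation: groupby splits the list into runs of consecutive elements that share the same key. Printing each key with its group shows the runs: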

for key, group in itertools.groupby(a_list, lambda x : x==''):
    key_and_group = {key : list(group)}
    print(key_and_group)

Output:

{False: ['hello']}
{True: ['', '', '']}
{False: ['this is it']}
{True: ['']}
{False: ['the one']}
{True: ['', '', '', '']}
{False: ['img', 'out there', 'when']}
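
The same grouping idea can be wrapped in a small helper. This is only a minimal sketch: the name collapse_blanks is hypothetical, and the strip() check is borrowed from the question's code so that whitespace-only strings also count as empty (an assumption; drop it if only '' should match).

```python
from itertools import groupby


def collapse_blanks(items):
    """Return a new list in which each run of blank strings becomes one ''."""
    result = []
    # groupby yields one (key, group) pair per run of consecutive items
    # with the same key; the key is True for blank strings.
    for is_blank, group in groupby(items, key=lambda s: s.strip() == ''):
        if is_blank:
            result.append('')     # keep a single '' for the whole run
        else:
            result.extend(group)  # keep non-blank items unchanged
    return result


test_data = ['hello', '', '', '', 'this is it', '', 'the one',
             '', '', '', '', 'img', 'out there', 'when']
print(collapse_blanks(test_data))
```

With the test data from the question, this should print the cleaned list shown in the question.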
I'mahdi

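You're almost there. The counter has to be reset when a non-empty element is seen, not when a second empty one is found: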

output = []
newline_count = 0
for element in test_data:
    if element == '':
        if newline_count == 0:
            output.append(element)
            newline_count += 1
    else: 
        output.append(element)
        newline_count = 0
print(output)
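
To see where the extra empty element comes from, the loop from the question can be run unchanged. The sketch below only reformats it and adds comments; the behaviour described in the comments follows from tracing it on the question's test data.

```python
test_data = ['hello', '', '', '', 'this is it', '', 'the one',
             '', '', '', '', 'img', 'out there', 'when']

# The loop from the question, reformatted but otherwise unchanged.
output = []
newline_count = 0
for element in test_data:
    if element.strip() != '':
        output.append(element)   # the counter is NOT reset here
    else:
        newline_count += 1
        if newline_count < 2:
            output.append('')
        else:
            # The reset happens on the second empty string of a run,
            # so a third empty string is appended again.
            newline_count = 0
print(output)
# ['hello', '', '', 'this is it', 'the one', '', '', 'img', 'out there', 'when']
```

Because the counter also carries over between runs, the single empty element after 'this is it' is dropped entirely, which is why resetting on non-empty elements is needed.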

ItayB