I have a list of lists (with sub-lists of irregular length) on which I want to perform re operations, but I can't get it to work. I'm sure I'm missing something trivial; could someone point out what I'm doing wrong?

Consider the following code snippets:

import pprint
import re

test_list = [  # sample list of lists in which I want to replace the "\n"
      ["test\n\n\n\n\n\n\n\n", "another test\n", "spam"],
      ["egg\n\n", "house"],
      ["\n\nabc", "def\n", "\n\n\nghi", "jklm\n\n", "nop(e)", "\nqrst\n"],
      ["uvw\n", "\n\nx", "yz\n\n"]]
for item in test_list:
    for subitem in item:
        re.sub('\n', '___', subitem)
pprint.pprint(test_list)

Output:

[['test\n\n\n\n\n\n\n\n', 'another test\n', 'spam'],
 ['egg\n\n', 'house'],
 ['\n\nabc', 'def\n', '\n\n\nghi', 'jklm\n\n', 'nop(e)', '\nqrst\n'],
 ['uvw\n', '\n\nx', 'yz\n\n']]

(The output is unchanged - the replacement didn't work.)

Thanks in advance for the help.

Edit:

Thanks Wiktor Stribiżew for the link. The first piece of advice from the referenced question (strings are immutable!) was helpful, but I cannot get it to work for the list of lists.

Following the advice from here and here, my code looks like this:

newtestlist = [[re.sub("\n", '_', item) for subitem in item] for item in testlist]

However, it doesn't work: it throws a TypeError: expected string or bytes-like object, so I'm apparently not referring to the subitems of my list correctly. Can someone point me in the right direction? Many thanks.

Ivo

1 Answer

For a simple list of lists, your edited solution should have worked, but you have to change re.sub("\n", '_', item) to re.sub("\n", '_', subitem), as @Mark Meyer noted. I also noticed a typo: testlist instead of test_list. Here's what I tested, and it worked with your test_list:

[[re.sub(r'\n', r'_', item) for item in sub_list] for sub_list in test_list]
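As for why the first loop in the question left the list unchanged: re.sub returns a new string and never modifies its argument, so the result has to be stored somewhere. If you would rather update the existing list than build a new one, a minimal sketch could look like this (the `sample` data is a shortened, hypothetical stand-in for your test_list):

```python
import re

# Hypothetical sample data, shorter than the list in the question.
sample = [["test\n\n", "spam"], ["egg\n", "\nhouse"]]

# re.sub returns a new string, so the result is written back
# into the sub-list by index (obtained via enumerate).
for sub_list in sample:
    for i, text in enumerate(sub_list):
        sub_list[i] = re.sub(r"\n", "___", text)

print(sample)
```

The comprehension above builds a new list and leaves the original alone; this loop changes the sub-lists in place. Which one fits better depends on whether you still need the original data.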

But if you have a deeply nested list, I think you'll need a recursive function.

import re

def sub_nested(l, regin, regout):
    """Recursive function to do string replace in a nested list"""
    retlist = []
    for item in l:
        if isinstance(item, list):
            retlist.append(sub_nested(item, regin, regout))
        else:
            retlist.append(re.sub(regin, regout, item))
    return retlist

Testing it on your input list.

sub_nested(test_list, r'\n', r'___')

Out: 
 [['test________________________', 'another test___', 'spam'],
 ['egg______', 'house'],
 ['______abc', 'def___', '_________ghi', 'jklm______', 'nop(e)', '___qrst___'],
 ['uvw___', '______x', 'yz______']]
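To show the recursion on input that is actually nested more than one level deep, here is a self-contained sketch. The function is the same idea as above with renamed parameters, and the `deep` list is made-up data for illustration only:

```python
import re

def sub_nested(items, pattern, repl):
    """Apply re.sub to every string in an arbitrarily nested list."""
    result = []
    for item in items:
        if isinstance(item, list):
            # Recurse into sub-lists, preserving the nesting.
            result.append(sub_nested(item, pattern, repl))
        else:
            result.append(re.sub(pattern, repl, item))
    return result

# Hypothetical input with three levels of nesting and an empty sub-list.
deep = ["a\n", ["b\n\n", ["\nc", "d"]], []]
result = sub_nested(deep, r"\n", "_")
print(result)
```

Note that this version assumes every non-list element is a string; anything else (None, numbers) would make re.sub raise a TypeError. The input list itself is not modified, a new nested list is returned.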
najeem