2

I am trying to get rid of newline characters in the strings of a list:

file_stuff
['John Smith\n', '\n', 'Gardener\n', '\n', 'Age 27\n', '\n', 'Englishman']

file_stuff_new = [x for x in file_stuff if x != '\n']
file_stuff_new = [x.replace('\n', '') for x in file_stuff_new]
file_stuff_new

['John Smith', 'Gardener', 'Age 27', 'Englishman']

This apparently works. Any other suggestions?

user8270077
  • Note that a list comprehension will *generate* a list, not modify one. Are you storing the result of the list comprehension somewhere and looking at that? – jedwards Jun 08 '18 at 09:33
  • Possible duplicate of [How can I remove (chomp) a trailing newline in Python?](https://stackoverflow.com/questions/275018/how-can-i-remove-chomp-a-trailing-newline-in-python) – Vikas Periyadath Jun 08 '18 at 09:41
  • Please update your question with the output you are expecting. – quamrana Jun 08 '18 at 09:42

4 Answers

1

You could use strip():

file_stuff = list(map(lambda s: s.strip(), file_stuff))
print(file_stuff)
# ['John Smith', '', 'Gardener', '', 'Age 27', '', 'Englishman']

Use filter if you also want to remove the empty items from the list:

file_stuff = list(filter(None, map(lambda s: s.strip(), file_stuff)))
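
As a side note, on Python 3 both map and filter return lazy iterators rather than lists, which matters if the result is printed, indexed, or iterated more than once. The following is a minimal sketch of that behaviour, assuming Python 3; the names lazy and cleaned are only illustrative.

```python
file_stuff = ['John Smith\n', '\n', 'Gardener\n', '\n', 'Age 27\n', '\n', 'Englishman']

# On Python 3, map() and filter() are lazy: nothing has been stripped yet.
lazy = filter(None, map(str.strip, file_stuff))

# filter(None, ...) drops falsy items, i.e. the empty strings left by strip().
cleaned = list(lazy)
print(cleaned)

# The iterator is now exhausted, so a second pass yields nothing.
print(list(lazy))
```

Materialising the result once with list() avoids surprises from the exhausted iterator.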
Sudhir Bastakoti
1

You are using a raw string literal.

r'\n' is not the newline character, it's a string of length two containing the characters "\" and "n".

>>> r'\n'
'\\n'
>>> len(r'\n')
2

Otherwise, your original approach works (almost) fine.

>>> file_stuff = ['John Smith\n', '\n', 'Gardener\n', '\n', 'Age 27\n', '\n', 'Englishman']
>>> [x.replace('\n', '') for x in file_stuff]
['John Smith', '', 'Gardener', '', 'Age 27', '', 'Englishman']

We can filter out the empty strings like this:

>>> file_stuff = ['John Smith\n', '\n', 'Gardener\n', '\n', 'Age 27\n', '\n', 'Englishman']
>>> no_newline = (x.replace('\n', '') for x in file_stuff)
>>> result = [x for x in no_newline if x]
>>> result
['John Smith', 'Gardener', 'Age 27', 'Englishman']

where no_newline is a memory-efficient generator that does not build an intermediate list.
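
One consequence of using a generator is worth spelling out: it can only be consumed once. The sketch below repeats the example above and then tries to reuse the generator; the name leftover is just for illustration.

```python
file_stuff = ['John Smith\n', '\n', 'Gardener\n', '\n', 'Age 27\n', '\n', 'Englishman']

# Generator expression: items are produced one at a time, on demand.
no_newline = (x.replace('\n', '') for x in file_stuff)

# The list comprehension pulls each item through the generator exactly once.
result = [x for x in no_newline if x]
print(result)

# The generator is exhausted after that single pass.
leftover = list(no_newline)
print(leftover)
```

If the stripped values are needed more than once, building a list up front is probably the simpler choice.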

If you just want to strip whitespace and newline characters from the beginning and end of your strings, consider the str.strip method.

>>> file_stuff = ['John Smith\n', '\n', 'Gardener\n', '\n', 'Age 27\n', '\n', 'Englishman']
>>> no_newline = (x.strip() for x in file_stuff)
>>> result = [x for x in no_newline if x]
>>> result
['John Smith', 'Gardener', 'Age 27', 'Englishman']

This could be shortened to

>>> result = [x.strip() for x in file_stuff if x.strip()]
>>> result
['John Smith', 'Gardener', 'Age 27', 'Englishman']

if you can live with the inelegance of calling str.strip twice per string.
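
If Python 3.8 or later is available (an assumption, since the question does not state a version), an assignment expression is one possible way to keep the one-liner while calling str.strip only once per string. A minimal sketch:

```python
file_stuff = ['John Smith\n', '\n', 'Gardener\n', '\n', 'Age 27\n', '\n', 'Englishman']

# (s := x.strip()) strips once, binds the value to s, and the comprehension
# keeps s only when it is non-empty. Requires Python 3.8+.
result = [s for x in file_stuff if (s := x.strip())]
print(result)
```

Whether this reads better than the two-step generator version is a matter of taste.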

timgeb
  • I guess that instead of calling `strip()` twice you could use a regular expression to remove what you need to be removed. Not sure about the performance though. – ChatterOne Jun 08 '18 at 09:53
  • @ChatterOne you could also use `result = [y for x in file_stuff for y in [x.strip()] if y]` but that just makes things worse. I included the last one more for educational reasons, to illustrate why I think that creating the generator is a good approach. I think it's certainly more readable than map/filter chains. Regex might be a bit overkill for this task, but will certainly work. – timgeb Jun 08 '18 at 09:55
0

You could map a function such as replace over your list (wrapped in list(), since map returns an iterator on Python 3):

file_stuff = list(map(lambda x: x.replace("\n", ""), file_stuff))
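
Note that replace and strip are not interchangeable. The sketch below uses a made-up string (not from the question) to show the difference: replace removes every newline, including ones in the middle, while strip only trims whitespace at both ends.

```python
# Hypothetical input with leading spaces, an embedded newline and a trailing newline.
line = '  Age\n27 \n'

# replace() removes all newlines but leaves other whitespace alone.
replaced = line.replace('\n', '')
print(repr(replaced))

# strip() trims all leading and trailing whitespace but keeps the embedded newline.
stripped = line.strip()
print(repr(stripped))
```

For lines read from a file, where the newline only appears at the end, either approach gives the same visible text apart from surrounding spaces.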
DFE
0

This example is a simple list comprehension with a condition:

>>> stuff = ['John Smith\n', '\n', 'Gardener\n', '\n', 'Age 27\n', '\n', 'Englishman']
>>> pure = [i.strip() for i in stuff if i.strip()]
>>> print(pure)
['John Smith', 'Gardener', 'Age 27', 'Englishman']
Bijoy