In Python I can easily read a file line by line into a set, just by using:

file = open("filename.txt", 'r')
content = set(file)

Each element of the set consists of the actual line plus its trailing line break.

Now I have a string with multiple lines, which I want to compare to the content by using the normal set operations.

Is there a way to transform a string into a set in the same way, so that each element also keeps its line break?


Edit:

The question "In Python, how do I split a string and keep the separators?" deals with a similar problem, but the answer isn't easy to adapt to other use cases.

import re
content = re.split("(\n)", string)

doesn't have the expected effect.
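
To illustrate why this falls short, here is a small sketch (the sample string is made up for the example). With a capturing group, `re.split` does keep the separators, but it returns them as separate list items rather than leaving them attached to the lines:

```python
import re

text = "foo\nbar\nbaz"  # hypothetical sample input

# The capturing group makes re.split return the separators too,
# but as their own list elements, not as part of each line.
parts = re.split("(\n)", text)
print(parts)  # ['foo', '\n', 'bar', '\n', 'baz']
```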

white_gecko
  • Are you sure this is what you want? Would it be easier to write file.read().split('\n') and compare that to your string? – Erik Godard Oct 22 '16 at 22:17
  • Most of the time the right thing to do is to ignore whitespace; you can do `set(line.strip() for line in file)` – Paulo Scardine Oct 22 '16 at 22:18
  • Possible duplicate of [In Python, how do I split a string and keep the separators?](http://stackoverflow.com/questions/2136556/in-python-how-do-i-split-a-string-and-keep-the-separators) – xenteros Oct 23 '16 at 19:28
  • @xenteros actually not, because the answers there are not very portable to other use-cases. I've added a statement to that in the bottom part of the question. – white_gecko Oct 23 '16 at 20:13

4 Answers

7

The str.splitlines() method does exactly what you want if you pass True as the optional keepends parameter. It keeps the newlines on the end of each line, and doesn't add one to the last line if there was no newline at the end of the string.

text = "foo\nbar\nbaz"
lines = text.splitlines(True)
print(lines) # prints ['foo\n', 'bar\n', 'baz']
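
As a rough sketch of how this fits the original use case, the result can be wrapped in `set()` and compared with the lines read from a file. The file content and the string below are invented for the example, and a temporary file stands in for `filename.txt`:

```python
import os
import tempfile

# Hypothetical file content, written to a temporary file for the demo.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, newline="") as tmp:
    tmp.write("foo\nbar\nqux\n")
    path = tmp.name

try:
    with open(path, "r") as f:
        file_lines = set(f)  # each element keeps its trailing '\n'
finally:
    os.remove(path)

text = "foo\nbar\nbaz\n"  # hypothetical multi-line string
text_lines = set(text.splitlines(True))  # keepends=True keeps the '\n'

common = file_lines & text_lines
only_in_text = text_lines - file_lines
print(sorted(common))        # ['bar\n', 'foo\n']
print(sorted(only_in_text))  # ['baz\n']
```

Because both sets hold lines with their line breaks attached, the usual set operators work on them directly.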
Moinuddin Quadri
Blckknght
3

Here's a simple generator that does the job:

content = set(e + "\n" for e in s.split("\n"))

This solution adds an additional newline at the end though.

Francisco
  • Worse than adding a newline to the last line if it didn't have one already, this code adds an entire extra string containing only a newline if the original text *did* end with a trailing newline. So if `s` is `"foo\nbar\nbaz\n"`, your code will make `content` a set equal to `{'foo\n', 'bar\n', 'baz\n', '\n'}`. – Blckknght Oct 22 '16 at 22:47
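
The difference can be checked with a short sketch (the sample string is an assumption for illustration), comparing the `split`-based approach with `str.splitlines(True)` on text that ends in a newline:

```python
s = "foo\nbar\nbaz\n"  # sample text that ends with a newline

# split("\n") yields a trailing empty string, which becomes a lone "\n".
via_split = set(e + "\n" for e in s.split("\n"))

# splitlines(True) keeps line endings and produces no extra element.
via_splitlines = set(s.splitlines(True))

print(sorted(via_split))       # ['\n', 'bar\n', 'baz\n', 'foo\n']
print(sorted(via_splitlines))  # ['bar\n', 'baz\n', 'foo\n']
```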
2

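You can also do it the other way round: strip the line endings when reading the file, assuming the file is opened with universal newlines (the default for text mode in Python 3; the 'U' flag was only needed in Python 2 and was removed in Python 3.11):

file = open("filename.txt", 'r')  # text mode translates line endings to '\n' by default
content = set(line.rstrip('\n') for line in file)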
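
A minimal sketch of the comparison under this approach, where the string is split without keeping line endings. Here `io.StringIO` stands in for the opened file, and both sample texts are assumptions made for the example:

```python
import io

# io.StringIO plays the role of the opened file in this sketch.
fake_file = io.StringIO("foo\nbar\nbaz\n")
content = set(line.rstrip("\n") for line in fake_file)

text = "bar\nbaz\nqux"  # hypothetical multi-line string
text_lines = set(text.splitlines())  # no keepends, so no line breaks

print(sorted(content & text_lines))  # ['bar', 'baz']
print(sorted(text_lines - content))  # ['qux']
```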
Aprillion
0

Could this be what you mean?

>>> from io import StringIO
>>> someLines=StringIO('''\
... line1
... line2
... line3
... ''')
>>> content=set(someLines)
>>> content
{'line1\n', 'line2\n', 'line3\n'}
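
As a quick sanity check, and only for text whose lines are separated by plain `\n`, the `StringIO` approach and `str.splitlines(True)` appear to give the same set. The sample text below is made up, and results may differ for other line boundaries that `splitlines` recognises, such as a bare `\r`:

```python
from io import StringIO

text = "line1\nline2\nline3"  # note: no newline after the last line

via_stringio = set(StringIO(text))           # iterate the string like a file
via_splitlines = set(text.splitlines(True))  # keepends=True

print(via_stringio == via_splitlines)  # True
print(sorted(via_stringio))            # ['line1\n', 'line2\n', 'line3']
```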
Bill Bell