So, I have sections and text in these sections:
[Section1]
Some weired text in section 1
[Section2]
Some text in section 2
Some text
text
How can I get the text from one of these sections?
import re
sections = re.split(r'\[Section\d+\]', text)
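Then you can get the text of a single section by indexing the list. Note that sections[0] is whatever precedes the first header (an empty string here), so in your case:
sections[1] will give the text of section 1.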
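If you also want the section names, one possible variation (a sketch, not part of the answer above) is to put a capturing group in the pattern, because re.split then keeps the captured names in the result. The sample text with a blank line between the sections and the dictionary name sections are assumptions made for illustration.

```python
import re

# Assumed sample input: the two sections separated by a blank line.
text = """[Section1]
Some weired text in section 1

[Section2]
Some text in section 2
Some text
text"""

# With a capturing group, re.split keeps the header names in the result:
# ['', 'Section1', body1, 'Section2', body2]
parts = re.split(r'^\[(Section\d+)\][ \t]*$', text, flags=re.MULTILINE)

# parts[0] is whatever precedes the first header; names and bodies alternate after it.
sections = {name: body.strip() for name, body in zip(parts[1::2], parts[2::2])}

print(sections['Section2'])
```

Stripping each body removes the newline left behind after the header line; drop the strip() call if you need the text byte for byte.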
As shown, this code generates a dictionary of the lines in each section, in order, indexed by the section names.
It reads through the file line by line. When it recognises a section header it notes the name. As it reads subsequent lines, until it reads the next header, it saves them in sections, as a list under that name.
If you don't want or need the line-ends then strip them off in the append statement.
>>> import re
>>> patt = re.compile(r'^\s*\[\s*(section\d+)\s*\]\s*$', re.I)
>>> sections = {}
>>> with open('to_chew.txt') as to_chew:
...     while True:
...         line = to_chew.readline()
...         if line:
...             m = patt.match(line)
...             if m:
...                 section_name = m.groups()[0]
...                 sections[section_name] = []
...             else:
...                 sections[section_name].append(line)
...         else:
...             break
...
>>> sections
{'Section2': ['Some text in section 2\n', 'Some text\n', 'text'], 'Section1': ['Some weired text in section 1\n', '\n']}
Edit: simplified code.
>>> import re
>>> from collections import defaultdict
>>> patt = re.compile(r'^\s*\[\s*(section\d+)\s*\]\s*$', re.I)
>>> sections = defaultdict(list)
>>> with open('to_chew.txt') as to_chew:
...     for line in to_chew:
...         m = patt.match(line)
...         if m:
...             section_name = m.groups()[0]
...         else:
...             sections[section_name].append(line)
...
>>> sections
defaultdict(<class 'list'>, {'Section1': ['Some weired text in section 1\n', '\n'], 'Section2': ['Some text in section 2\n', 'Some text\n', 'text']})
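For reuse, the same loop could be wrapped in a function. The following is only a sketch under a few assumptions: the names read_sections, HEADER and sample are made up for illustration, the line-ends are stripped in the append as suggested above, and any lines that appear before the first header are skipped rather than raising a NameError.

```python
import re
from collections import defaultdict

# Same header pattern as in the session above.
HEADER = re.compile(r'^\s*\[\s*(section\d+)\s*\]\s*$', re.I)

def read_sections(lines):
    """Group lines under the most recently seen [SectionN] header."""
    sections = defaultdict(list)
    name = None  # no header seen yet
    for line in lines:
        m = HEADER.match(line)
        if m:
            name = m.group(1)
            sections[name]  # touch the key so that empty sections are recorded too
        elif name is not None:
            # Strip the line-end here if you do not need it.
            sections[name].append(line.rstrip('\n'))
    return dict(sections)

# Assumed sample input standing in for the contents of to_chew.txt.
sample = """[Section1]
Some weired text in section 1

[Section2]
Some text in section 2
Some text
text"""

sections = read_sections(sample.splitlines(keepends=True))
print(sections['Section2'])
```

Because the function accepts any iterable of lines, an open file object can be passed to it directly in place of the split sample string.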
Try this, assuming the sections are separated by a blank line:
text="""[Section1]
Some weired text in section 1

[Section2]
Some text in section 2
Some text
text"""
print(text.split('\n\n'))
# ['[Section1]\nSome weired text in section 1', '[Section2]\nSome text in section 2\nSome text\ntext']
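To look a section up by name after splitting on blank lines, each chunk can be divided into its header line and its body. This is a rough sketch that assumes every chunk starts with a [SectionN] header and that no section body contains a blank line of its own; the dictionary name sections is made up for illustration.

```python
# Assumed sample input: the two sections separated by a blank line.
text = """[Section1]
Some weired text in section 1

[Section2]
Some text in section 2
Some text
text"""

sections = {}
for chunk in text.split('\n\n'):
    # The first line of each chunk is the header, the rest is the body.
    header, _, body = chunk.partition('\n')
    sections[header.strip('[]')] = body

print(sections['Section1'])
```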