Python. How to parse text by section?

Question

I have text divided into sections, like this:

[Section1]
Some weired text in section 1

[Section2]
Some text in section 2
Some text
text

How can I get the text from just one of these sections?

Possible duplicate of [Split Strings with Multiple Delimiters?](https://stackoverflow.com/questions/1059559/split-strings-with-multiple-delimiters) — Steffen Winkler, Aug 02 '17 at 10:44
Read about [configparser](https://docs.python.org/3.6/library/configparser.html) — stovfl, Aug 02 '17 at 13:41

Rahul · Accepted Answer · 2017-08-02T14:19:51.480

4

import re
sections = re.split(r'\[Section\d+\]', text)

Then you can get the text of any one section by indexing the resulting list. In your case:

sections[1] will give section 1 (sections[0] is the empty string that precedes the first header).
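If you would rather look sections up by name than by position, a capturing group in the pattern makes re.split keep the header names. This is only a sketch, assuming the headers always look like [SectionN] on a line of their own; the dict name sections_by_name is my own choice, not something from the question.

```python
import re

text = """[Section1]
Some weired text in section 1

[Section2]
Some text in section 2
Some text
text"""

# The capturing group makes re.split return the section names
# in between the pieces of text it splits off.
parts = re.split(r'^\[(Section\d+)\][ \t]*$', text, flags=re.MULTILINE)

# parts[0] is whatever precedes the first header; after that,
# names and bodies alternate: name, body, name, body, ...
sections_by_name = {
    name: body.strip()
    for name, body in zip(parts[1::2], parts[2::2])
}

print(sections_by_name['Section2'])
```

Calling strip() on each body removes the blank lines around the text; drop it if you need the whitespace preserved.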

edited Aug 02 '17 at 14:19

answered Aug 02 '17 at 10:39

Rahul


One of these. Not all. – greenbeaf Aug 02 '17 at 13:38
Also consider using [configparser](https://docs.python.org/3.6/library/configparser.html) – Rahul Aug 02 '17 at 14:39
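For what it's worth, the configparser suggestion can be made to work on the sample, but only with caveats. The sketch below assumes no line contains "=", that no line is repeated within a section, and that the blank lines do not need to be kept. With allow_no_value=True each text line is stored as a key without a value, so this is arguably a misuse of the module rather than its intended purpose.

```python
import configparser

text = """[Section1]
Some weired text in section 1

[Section2]
Some text in section 2
Some text
text"""

# allow_no_value=True lets lines without a delimiter be stored as bare keys.
# Restricting delimiters to "=" keeps a ":" inside a line from splitting it.
parser = configparser.ConfigParser(allow_no_value=True, delimiters=('=',))
parser.optionxform = str  # keep the original case of each line
parser.read_string(text)

# Each section's "keys" are its text lines, in file order.
section2_lines = list(parser['Section2'])
print(section2_lines)
```

Blank lines are dropped and a repeated line raises DuplicateOptionError, so for free-form text the regex-based approaches are probably the safer choice.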

Bill Bell · Answer 2 · 2017-11-12T16:52:40.440

As shown, this code generates a dictionary of the lines in each section, in order, indexed by the section names.

It reads through the file line by line. When it recognises a section header it notes the name. As it reads subsequent lines, until it reads the next header, it saves them in sections, as a list under that name.

If you don't want or need the line-ends then strip them off in the append statement.

>>> import re
>>> patt = re.compile(r'^\s*\[\s*(section\d+)\s*\]\s*$', re.I)
>>> sections = {}
>>> with open('to_chew.txt') as to_chew:
...     while True:
...         line = to_chew.readline()
...         if line:
...             m = patt.match(line)
...             if m:
...                 section_name = m.groups()[0]
...                 sections[section_name] = []
...             else:
...                 sections[section_name].append(line)
...         else:
...             break
...
>>> sections
{'Section2': ['Some text in section 2\n', 'Some text\n', 'text'], 'Section1': ['Some weired text in section 1\n', '\n']}

Edit: simplified code.

>>> import re
>>> from collections import defaultdict
>>> patt = re.compile(r'^\s*\[\s*(section\d+)\s*\]\s*$', re.I)
>>> sections = defaultdict(list)
>>> with open('to_chew.txt') as to_chew:
...     for line in to_chew:
...         m = patt.match(line)
...         if m:
...             section_name = m.groups()[0]
...         else:
...             sections[section_name].append(line)
... 
>>> sections
defaultdict(<class 'list'>, {'Section1': ['Some weired text in section 1\n', '\n'], 'Section2': ['Some text in section 2\n', 'Some text\n', 'text']})

UnboundLocalError: local variable 'section_name' referenced before assignment — greenbeaf, Nov 12 '17 at 06:32
I suspect that you mistranscribed the code because I just ran it again, with success. — Bill Bell, Nov 12 '17 at 16:43
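One plausible cause of the error reported above (a guess, since the failing input was not posted) is running the loop inside a function on a file whose first line is not a section header, so section_name is used before it is ever set. Below is a sketch of the same approach wrapped in a function that ignores any lines before the first header and strips the line ends; parse_sections, HEADER and the "preamble" line are names and data I made up for illustration.

```python
import re
from collections import defaultdict

# Same pattern as in the answer: a header such as "[Section1]" on its own line.
HEADER = re.compile(r'^\s*\[\s*(section\d+)\s*\]\s*$', re.I)

def parse_sections(lines):
    """Group lines under the most recent section header."""
    sections = defaultdict(list)
    current = None  # no header seen yet
    for line in lines:
        m = HEADER.match(line)
        if m:
            current = m.group(1)
            sections[current]  # register the section even if it stays empty
        elif current is not None:
            # Lines before the first header are skipped instead of raising.
            sections[current].append(line.rstrip('\n'))
    return dict(sections)

sample = (
    "preamble line\n"
    "[Section1]\n"
    "Some weired text in section 1\n"
    "\n"
    "[Section2]\n"
    "Some text in section 2\n"
    "Some text\n"
    "text"
)
result = parse_sections(sample.splitlines(keepends=True))
print(result['Section2'])
```

To read from a file instead, pass the open file object, for example parse_sections(open('to_chew.txt')), since iterating over a file yields its lines.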

score 0 · Answer 3 · answered Aug 02 '17 at 10:44

0

Try this:

text="""[Section1]
Some weired text in section 1

[Section2]
Some text in section 2
Some text
text"""
print(text.split('\n\n'))
>>>['[Section1]\nSome weired text in section 1', '[Section2]\nSome text in section 2\nSome text\ntext']
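To get from those blocks to the text of a single section, each block can be separated into its header and body. This is a minimal sketch that assumes sections are separated by exactly one blank line and contain no blank lines themselves; the dict name blocks_by_name is my own.

```python
text = """[Section1]
Some weired text in section 1

[Section2]
Some text in section 2
Some text
text"""

blocks_by_name = {}
for block in text.split('\n\n'):
    # The first line of each block is the header; the rest is the body.
    header, _, body = block.partition('\n')
    # Remove the surrounding square brackets from the header.
    blocks_by_name[header.strip('[]')] = body

print(blocks_by_name['Section1'])
```

If a section can contain blank lines, this split will cut it in two, and a header-based approach such as the regex answers is more robust.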

answered Aug 02 '17 at 10:44

Mohamed Thasin ah

