How to split up a file by keyword?

Question

I have a large XML file that looks like

<data> skdfnlsniisimsoinfsdfoisdfinsdofinodnfonf <emrosem> 23324097234097g </emrosem> 

<peto> oifmisnie </peto>

</data>

<data> sfnseosfnosefoisneofinseionfoaisenfoisen <emrosem> 3249087203470w </emrosem>

<peto> sdfn </peto>

</data>

I want to separate this into a list that looks like

 [<data> skdfnlsniisimsoinfsdfoisdfinsdofinodnfonf <emrosem> 23324097234097g </emrosem> 
 <peto> oifmisnie </peto></data>, <data> sfnseosfnosefoisneofinseionfoaisenfoisen             
 <emrosem> 3249087203470w </emrosem> <peto> sdfn </peto> </data>]

In other words, I want to split it based on the word "data".

I'm using Python 2.7. Thanks for the help.

score 2 · Accepted Answer · answered Jul 12 '11 at 19:48


The XML parser included in Python's standard library is one way to parse XML. It might be a bit kludgey to get the data out of it and into a list with the tags intact, but it should be doable.
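A minimal sketch of that approach, assuming the parser meant here is `xml.etree.ElementTree` from the standard library. The function name `split_data_elements` and the `<root>` wrapper are made up for illustration. The wrapper is needed because the sample has several top-level `<data>` elements, which is not a well-formed document on its own. It also assumes the text has no XML declaration at the top and is small enough to hold in memory.

```python
import xml.etree.ElementTree as ET


def split_data_elements(text):
    """Return each top-level <data> element as a string, tags intact."""
    # Hypothetical wrapper: give the parser a single root element.
    root = ET.fromstring("<root>" + text + "</root>")
    chunks = []
    for elem in root.findall("data"):
        # tostring() returns bytes and also emits the whitespace that
        # follows the closing tag, so decode and strip the result.
        chunks.append(ET.tostring(elem).decode("ascii").strip())
    return chunks


# Shortened version of the sample from the question.
sample = """
<data> skdfnls <emrosem> 23324097234097g </emrosem>

<peto> oifmisnie </peto>

</data>

<data> sfnseos <emrosem> 3249087203470w </emrosem>

<peto> sdfn </peto>

</data>
"""

parts = split_data_elements(sample)
for part in parts:
    print(part)
```

Each entry in `parts` is one `<data> ... </data>` block re-serialized by the parser, so the whitespace inside an element is kept but the exact bytes may differ from the input if the source uses entities or unusual quoting.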

answered Jul 12 '11 at 19:48

thegrinner


Alright, thanks. It's really difficult to deal with the intricacies of XML, so I should probably just learn2parse. Thanks :D – TheWarmthOfTheSun Jul 12 '11 at 19:58

score 0 · Answer 2 · edited May 23 '17 at 10:24


Please don't use regular expressions for this. If you need to parse XML, use an XML parser. XML just has too many subtleties to handle it with simple string manipulation routines. For a nice explanation as to why, see the first answer to this question.
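One small illustration of the kind of subtlety meant here, as a sketch rather than a full argument: the input below is invented, and `xml.etree.ElementTree` is assumed as the parser. A comment that happens to contain `</data>` cuts a regex match short, while the parser skips the comment and still sees two complete elements.

```python
import re
import xml.etree.ElementTree as ET

# Invented input: the first <data> element contains a comment that
# itself includes the text "</data>".
text = "<root><data>a <!-- </data> --> b</data><data>c</data></root>"

# Naive approach: a non-greedy regex stops at the first "</data>",
# which sits inside the comment.
naive = re.findall(r"<data>.*?</data>", text, re.S)
print(naive[0])

# The parser knows what a comment is and returns both elements whole.
parsed = ET.fromstring(text).findall("data")
print(len(parsed))
print("".join(parsed[0].itertext()))
```

The regex still reports two matches, so the truncation of the first one is easy to miss.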

edited May 23 '17 at 10:24

Community


answered Jul 12 '11 at 19:50

tdammers

