
I'm trying to convert an XML file to a CSV file. My input XML file looks like this:

<Row>
  <Cell>
    <Data Type="String" >START</Data>
  </Cell>
  <Cell>
    <Data Type="DateTime" >2013-01-15T21:30:42</Data>
  </Cell>
  <Cell>
    <Data Type="String" ></Data>
  </Cell>
  <Cell>
    <Data Type="String" >Start 'suite8'</Data>
  </Cell>
  <Cell>
    <Data Type="String" >Test 'suite8' started</Data>
  </Cell>
  <Cell>
    <Data Type="String" ></Data>
  </Cell>
</Row>
<Row/>
<Row>
  <Cell>
    <Data Type="String" >START_TEST_CASE</Data>
  </Cell>
  <Cell>
    <Data Type="DateTime" >2013-01-15T21:30:42</Data>
  </Cell>
  <Cell>
    <Data Type="String" ></Data>
  </Cell>
  <Cell>
    <Data Type="String" >Start 'case1'</Data>
  </Cell>
  <Cell>
    <Data Type="String" >Test Case 'case1' started</Data>
  </Cell>
  <Cell>
    <Data Type="String" >case1</Data>
  </Cell>
</Row>

I'm interested in the text between each opening <Data ...> tag (e.g. <Data Type="String" >) and its closing </Data> tag. A new output line should start whenever a <Row> tag appears.

The output CSV file I want should look like this:

START,2013-01-15T21:30:42,,Test 'suite8' started 

START_TEST_CASE,2013-01-15T21:30:42,,Start 'case1',Test Case 'case1' started,case1

I hope this is clear enough, any help is greatly appreciated :) Thanks!

Babycece
  • Your best bet is a perl or python script which will be just a few lines but will be much more robust and faster to write than any shell/grep/sed/awk one might be able to get to work. Are you open to such an option? – sds Jan 16 '13 at 21:45
  • You do realise that your desired output isn't actually well-formed CSV? The numbers of "columns" between your rows differ.... – tink Jan 16 '13 at 21:47
  • You should use an XML parser. – squiguy Jan 16 '13 at 21:47
  • sds: I'm open to other options, but I know even less about Perl or Python scripting. I'm a newbie :( – Babycece Jan 16 '13 at 22:43
  • tink: I realised it now, thank you :) – Babycece Jan 16 '13 at 22:43
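
As a rough illustration of the Python/XML-parser route suggested in the comments above, here is a minimal sketch using only the standard library (`xml.etree.ElementTree` and `csv`). It is not anyone's posted solution. The function name `rows_to_csv`, the `skip_empty_rows` flag, and the wrapping `<Rows>` root element are made up for this example. It assumes the input is a bare sequence of `<Row>` elements as in the sample, with no XML declaration and no namespaces; a real spreadsheet export would probably need namespace handling.

```python
import csv
import io
import xml.etree.ElementTree as ET


def rows_to_csv(xml_fragment: str, skip_empty_rows: bool = True) -> str:
    """Convert <Row>/<Cell>/<Data> markup to CSV text (illustrative helper)."""
    # The sample has several top-level <Row> elements, so wrap it in a
    # hypothetical <Rows> root to obtain a well-formed document.
    root = ET.fromstring("<Rows>" + xml_fragment + "</Rows>")

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in root.iter("Row"):
        # One CSV field per <Data> element; an empty element gives an empty field.
        fields = [(data.text or "") for data in row.iter("Data")]
        # Assumption: a bare <Row/> should not produce an output line.
        if fields or not skip_empty_rows:
            writer.writerow(fields)
    return out.getvalue()


SAMPLE = """
<Row>
  <Cell><Data Type="String" >START_TEST_CASE</Data></Cell>
  <Cell><Data Type="DateTime" >2013-01-15T21:30:42</Data></Cell>
  <Cell><Data Type="String" ></Data></Cell>
  <Cell><Data Type="String" >case1</Data></Cell>
</Row>
<Row/>
"""

print(rows_to_csv(SAMPLE), end="")
```

Note that this emits every `<Data>` value in a row, including empty ones, so the number of columns follows the number of cells in each `<Row>`. The `csv` module also takes care of quoting if a value ever contains a comma or a double quote.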

2 Answers


Take a look at XSLT stylesheets and the xsltproc command. If you just need to convert all the data unconditionally, with each <Row> becoming one line of comma-separated values taken from its cell tags, the stylesheet is relatively simple.

A quick search yielded this: XML to CSV Using XSLT. With a few adaptations for your XML, it should do what you need.

Bernhard

Parsing XML with Bash has been addressed here before:

How to parse XML in Bash?

That said, it seems like a painful way to live.

Slartibartfast