I am trying to do some sample data analysis using Hadoop, and I found some XML data like this:

<root>
  <title>Document Title</title>
  <content>Some document content.</content>
  <keywords>test, document, keyword</keywords>
</root>

How can I convert this into CSV, i.e.:

Document Title,Some document content.,test, document, keyword

vakul
  • A simple way would be to read the XML, get the node values, and convert it into a CSV. Give it a try and check back if you face any issue with your code. – Pankaj Jaju Jan 23 '14 at 16:25
  • Or you can google ... there are plenty of free tools available :) – Pankaj Jaju Jan 23 '14 at 16:27
  • possible duplicate of [XML to CSV Using XSLT](http://stackoverflow.com/questions/365312/xml-to-csv-using-xslt) – Louis Jan 23 '14 at 17:24
  • @Louis I would like to know whether there is a way I could create an application like that one myself. – vakul Jan 23 '14 at 18:17
  • Please be specific about the column conversion. The example you provided could lead to a variable number of columns, since the number of keywords varies. Consider specifying whether all tag nodes below the root are non-repeating and whether quoting is needed (e.g. "dd, cc"), as some strings can contain commas. – Ted Johnson Jan 30 '14 at 06:55
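
The approach suggested in the first comment (read the XML, take the node values, write a CSV row) can be sketched in a few lines of Python. This is only a minimal sketch, assuming each document has the flat shape shown in the question, with non-repeating children directly under the root. The function name `xml_to_csv_row` is made up for illustration. The standard `csv` module quotes any value that contains a comma, which covers the keywords case raised above.

```python
import csv
import io
import xml.etree.ElementTree as ET


def xml_to_csv_row(xml_text):
    """Return one CSV line built from the text of the root's direct children.

    Hypothetical helper: assumes a flat document like the one in the question.
    """
    root = ET.fromstring(xml_text)
    # Collapse runs of whitespace, similar to XPath normalize-space().
    values = [" ".join("".join(child.itertext()).split()) for child in root]
    buffer = io.StringIO()
    # QUOTE_MINIMAL (the default) quotes only the fields that need it,
    # e.g. the keywords value, which contains commas.
    csv.writer(buffer, lineterminator="\n").writerow(values)
    return buffer.getvalue()


sample = """<root>
  <title>Document Title</title>
  <content>Some document content.</content>
  <keywords>test, document, keyword</keywords>
</root>"""

print(xml_to_csv_row(sample), end="")
# Document Title,Some document content.,"test, document, keyword"
```

Unlike the output requested in the question, the keywords here stay in a single quoted column, so every row has the same number of fields.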

1 Answer

Found an XML transform stylesheet

The stylesheet there could be helpful:

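<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" encoding="iso-8859-1"/>

<xsl:strip-space elements="*" />

<!-- Match the root element and emit one CSV row from its child elements.
     If the records are children of the root instead, use match="/*/*". -->
<xsl:template match="/*">
<xsl:for-each select="child::*">
<!-- Every field except the last is followed by a comma. -->
<xsl:if test="position() != last()">"<xsl:value-of select="normalize-space(.)"/>",</xsl:if>
<!-- The last field is followed by a line break instead. -->
<xsl:if test="position() = last()">"<xsl:value-of select="normalize-space(.)"/>"<xsl:text>&#xD;</xsl:text>
</xsl:if>
</xsl:for-each>
</xsl:template>

</xsl:stylesheet>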

Depending on where you want to use the CSV file, you may want to remove the quotes inside the xsl:if tags so that the values are not wrapped in quotes.
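
Before dropping the quotes, it may help to check how a CSV parser reads each variant. The Python sketch below parses a quoted and an unquoted version of the sample row with the standard `csv` module. Both strings are hand-written stand-ins based on the question's data, not captured stylesheet output, and `parse_line` is a name made up for illustration.

```python
import csv

# Hand-written stand-ins for one output row (assumed, not captured output).
quoted = '"Document Title","Some document content.","test, document, keyword"'
unquoted = 'Document Title,Some document content.,test, document, keyword'


def parse_line(line):
    """Parse a single CSV line into its list of fields."""
    return next(csv.reader([line]))


quoted_fields = parse_line(quoted)
unquoted_fields = parse_line(unquoted)

print(len(quoted_fields))    # 3: the keywords stay in one column
print(len(unquoted_fields))  # 5: each keyword becomes its own column
```

If the consumer expects a fixed number of columns, keeping the quotes is the safer choice for values such as the keywords.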

vakul