Need to process a file using a for loop

I have written the code below to convert CSV to XML. It has a separate echo line for each column's tag.
The input file has columns 1 to 278, and the output file needs tags A1 to A278, so I need a for loop instead of 278 separate lines.

Code :

file_in="Prepaid_plan_voucher.csv"
file_out="Prepaid_plan_voucher.xml"
echo '<?xml version="1.0"?>' > $file_out
#echo '<Customers>' >> $file_out
echo '  <TariffRecords>' >> $file_out
echo '  <Tariff>' >> $file_out
while IFS=$',' read -r -a arry
do
#  echo '  <TariffRecords>' >> $file_out
#  echo '  <Tariff>' >> $file_out
  echo '    <A1>'${arry[0]}'</A1>' >> $file_out
  echo '    <A2>'${arry[1]}'</A2>' >> $file_out
  echo '    <A3>'${arry[2]}'</A3>' >> $file_out
#  echo '  </TariffRecords>' >> $file_out
#  echo '  </Tariff>' >> $file_out
done < $file_in
#echo '</Customers>' >> $file_out
echo '  </Tariff>' >> $file_out
echo '  </TariffRecords>' >> $file_out

Sample input file (this is only a sample; the actual input file will contain 278 columns). If the input file has two or three records, all of them need to be written to the same XML file.

name,Tariff Summary,Record ID No.,Operator Name,Circle (Service Area),list
Prepaid Plan Voucher,test_All calls 2p/s,TT07PMPV0188,Ta Te,Gu,
Prepaid Plan Voucher,test_All calls 3p/s,TT07PMPV0189,Ta Te,HR,

Sample output file. The two outer tags, TariffRecords and Tariff, will be hard-coded at the beginning and end of the XML file.

<TariffRecords>
<Tariff>
<A1>Prepaid Plan Voucher</A1>
<A2>test_All calls 2p/s</A2>
<A3>TT07PMPV0188</A3>
<A4>Ta Te</A4>
<A5>Gu</A5>
<A6></A6>
</Tariff>
<Tariff>
<A1>Prepaid Plan Voucher</A1>
<A2>test_All calls 3p/s</A2>
<A3>TT07PMPV0189</A3>
<A4>Ta Te</A4>
<A5>HR</A5>
<A6></A6>
</Tariff>
</TariffRecords>
as7951

2 Answers

This is not the most elegant solution, but if I understand correctly you just want something simple. Changing your code as little as possible, I got:

NUM_OF_COLS=6
echo '<TariffRecords>' >> "$file_out"
while IFS=',' read -r -a arry
do
  tariff="  <Tariff>\n"
  for i in $(seq 1 "$NUM_OF_COLS"); do
    # Tag numbers start at 1, array indices at 0
    tariff="${tariff}    <A$i>${arry[i-1]}</A$i>\n"
  done
  tariff="${tariff}  </Tariff>"
  echo -e "${tariff}" >> "$file_out"
done < <(tail -n +2 "$file_in")
echo '</TariffRecords>' >> "$file_out"

Things to note:

We skip the CSV header with:

<(tail -n +2 "$file_in")

The for loop runs over the range 1 to $NUM_OF_COLS, which are the tag numbers (the array index is one less), generated by:

$(seq 1 "$NUM_OF_COLS")

We append to the string with:

tariff="${tariff}......"

We use

echo -e ...

to turn the \n sequences into real newlines and keep the formatting readable, but you could instead use another utility such as xmllint to pretty-print the output.
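
With 278 columns, hard-coding the column count is easy to get wrong. As a minimal sketch, the count could be taken from the header line instead. The function name csv_to_xml is hypothetical, and the sketch assumes a simple CSV: no quoted fields, no commas inside values, a header whose last field is not empty, and no characters such as & or < that would need XML escaping.

```shell
#!/usr/bin/env bash
# Sketch only: csv_to_xml is a hypothetical helper, not part of the answer above.
# Assumes a simple CSV with no quoted fields and no commas inside values.
csv_to_xml() {
  local file_in=$1 header line num_cols i
  local -a row

  # Count the columns from the header line instead of hard-coding the number.
  IFS= read -r header < "$file_in"
  header=${header%$'\r'}                 # tolerate Windows line endings
  IFS=',' read -r -a row <<< "$header"
  num_cols=${#row[@]}

  printf '%s\n' '<?xml version="1.0"?>' '<TariffRecords>'
  # The "|| [ -n ... ]" part keeps a last line that has no trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    line=${line%$'\r'}
    if [ -z "$line" ]; then continue; fi  # skip blank lines
    IFS=',' read -r -a row <<< "$line"
    printf '  <Tariff>\n'
    for ((i = 0; i < num_cols; i++)); do
      # Tag numbers start at 1; missing trailing fields become empty tags.
      printf '    <A%d>%s</A%d>\n' "$((i + 1))" "${row[i]:-}" "$((i + 1))"
    done
    printf '  </Tariff>\n'
  done < <(tail -n +2 "$file_in")
  printf '</TariffRecords>\n'
}
```

It could be called as csv_to_xml "$file_in" > "$file_out". Using printf with the value as an argument avoids the backslash interpretation that echo -e applies to the data.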

EDIT: For multiple files

In order to process multiple files, replace hardcoded:

file_in="Prepaid_plan_voucher.csv"
file_out="Prepaid_plan_voucher.xml"

by

file_in="$1" # Take the name as an argument from command line
file_out="${1%.csv}.xml" # Remove csv suffix and append xml

and run the script from command line for every csv file, e.g. like this:

$ for f in *.csv; do ./ourscript.sh "$f"; done
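
The suffix replacement and the per-file loop can also be sketched as two small functions. Both names, xml_name and convert_all, are hypothetical, and the converter command passed to convert_all is assumed to take the CSV file name as its last argument.

```shell
#!/usr/bin/env bash
# Sketch only: xml_name and convert_all are hypothetical helper names.

# Map "foo.csv" to "foo.xml" with the same ${var%.csv} expansion used above.
xml_name() {
  printf '%s\n' "${1%.csv}.xml"
}

# Run a command once for every .csv file in a directory.
# Usage: convert_all DIR COMMAND [ARGS...]
convert_all() {
  local dir=$1 f
  shift
  for f in "$dir"/*.csv; do
    # If nothing matched, the pattern itself is left in $f; skip it.
    [ -e "$f" ] || continue
    "$@" "$f"
  done
}
```

For example, convert_all . ./ourscript.sh runs the script for each CSV file in the current directory. Globbing with quoted "$f" also copes with file names that contain spaces, which parsing the output of ls does not.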
hradecek
  • @hradecek... I am able to create the XML file, but it is not read as XML. It shows up as a plain text file when I open it in a browser on Windows. Please help – as7951 Jun 19 '18 at 09:13
  • Sorry for the typo. I had `` instead of closing tag ``. Fixed. btw: XML (data representation) is usually stored in a text file. – hradecek Jun 19 '18 at 09:29
  • Hi @hradecek. Can you tell me how I can modify this code to pick up multiple CSV files with *.csv and generate a separate XML file for each one? – as7951 Jun 21 '18 at 10:26
  • @as7951 - I have edited an answer with one possible solution (the least "obtrusive" meaning *It will do the job*). – hradecek Jun 21 '18 at 11:44

Since it was mentioned in the comments, here's an option using XSLT 3.0.

The processor I tested with is Saxon-HE 9.8 and is run with a java command line. It should be easy to incorporate into a shell script to process multiple files.

CSV Input (added an additional row to show handling of another empty entry and a quoted entry that contains commas that aren't separators)

name,Tariff Summary,Record ID No.,Operator Name,Circle (Service Area),list
Prepaid Plan Voucher,test_All calls 2p/s,TT07PMPV0188,Ta Te,Gu,
Prepaid Plan Voucher,test_All calls 3p/s,TT07PMPV0189,Ta Te,HR,
Prepaid Plan Voucher,,TT07PMPV0190,Ta Te,DH,"some,comma,separated,list"

XSLT 3.0

<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" expand-text="yes">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:param name="csv-uri"/>
  <xsl:param name="csv-encoding" select="'UTF-8'"/>

  <xsl:template name="init">
    <TariffRecords>
      <xsl:choose>
        <xsl:when test="unparsed-text-available($csv-uri, $csv-encoding)">
          <xsl:call-template name="csv2xml"/>                               
        </xsl:when>
        <xsl:otherwise>
          <xsl:variable name="error">
            <xsl:text>Error reading "{$csv-uri}" (encoding "{$csv-encoding}").</xsl:text>
          </xsl:variable>
          <xsl:message><xsl:value-of select="$error"/></xsl:message>
        </xsl:otherwise>
      </xsl:choose>
    </TariffRecords>
  </xsl:template>

  <xsl:template name="csv2xml">
    <xsl:variable name="csv_content" select="unparsed-text($csv-uri, $csv-encoding)"/>
    <xsl:analyze-string select="$csv_content" regex="\r?\n">
      <xsl:non-matching-substring>
        <xsl:if test="position() > 1"><!--ignore header-->
          <Tariff>
            <xsl:analyze-string select="concat(.,',')" regex='"([^"]*)",?|([^,]+),?'>
              <!--group 1 is wrapped in quotes-->
              <!--group 2 is not wrapped quotes-->
              <xsl:matching-substring>
                <xsl:element name="A{position()}">
                  <xsl:value-of select="(regex-group(1),regex-group(2))" separator=""/>
                </xsl:element>
              </xsl:matching-substring>
              <xsl:non-matching-substring>
                <xsl:element name="A{position()}"/>
              </xsl:non-matching-substring>
            </xsl:analyze-string>
          </Tariff>          
        </xsl:if>
      </xsl:non-matching-substring>      
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>

Command line (see here for more info on running Saxon from the command line)

java -cp "C:/apps/SaxonHE9-8-0-11J/saxon9he.jar" net.sf.saxon.Transform -it:init -xsl:"csv2xml.xsl" -o:"output.xml" csv-uri="input.csv"

Output

<?xml version="1.0" encoding="UTF-8"?>
<TariffRecords>
   <Tariff>
      <A1>Prepaid Plan Voucher</A1>
      <A2>test_All calls 2p/s</A2>
      <A3>TT07PMPV0188</A3>
      <A4>Ta Te</A4>
      <A5>Gu</A5>
      <A6/>
   </Tariff>
   <Tariff>
      <A1>Prepaid Plan Voucher</A1>
      <A2>test_All calls 3p/s</A2>
      <A3>TT07PMPV0189</A3>
      <A4>Ta Te</A4>
      <A5>HR</A5>
      <A6/>
   </Tariff>
   <Tariff>
      <A1>Prepaid Plan Voucher</A1>
      <A2/>
      <A3>TT07PMPV0190</A3>
      <A4>Ta Te</A4>
      <A5>DH</A5>
      <A6>some,comma,separated,list</A6>
   </Tariff>
</TariffRecords>
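
To process several CSV files, the command line above could be wrapped in a shell function, roughly as sketched here. The names transform_csv, SAXON_JAR, XSL_FILE and DRY_RUN are hypothetical, the jar and stylesheet paths are placeholders to adjust, and the sketch assumes the csv-uri values resolve the same way as the input.csv in the command above.

```shell
#!/usr/bin/env bash
# Sketch only: all names below are hypothetical; adjust the paths to your setup.
SAXON_JAR=${SAXON_JAR:-saxon9he.jar}
XSL_FILE=${XSL_FILE:-csv2xml.xsl}

# Transform one CSV file; the output name swaps the .csv suffix for .xml.
# With DRY_RUN set, the command is printed instead of executed.
transform_csv() {
  local csv=$1
  local -a cmd=(java -cp "$SAXON_JAR" net.sf.saxon.Transform
                -it:init "-xsl:$XSL_FILE" "-o:${csv%.csv}.xml" "csv-uri=$csv")
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "${cmd[*]}"
  else
    "${cmd[@]}"
  fi
}
```

A loop such as for f in *.csv; do transform_csv "$f"; done would then produce one XML file per CSV file. Setting DRY_RUN=1 first shows the commands without running Java.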
Daniel Haley