I have a CSV file with the following fields and data.

type,state,priority,headline,
Defect,Closed,Very High,Hello|World|data | this|has|problem,

When I use the XML Output step, the resulting XML for the headline tag becomes

 `<headline>Hello&#x7c;World&#x7c;data &#x7c; this&#x7c;has&#x7c;problem</headline>` under UTF-8 encoding.

What should I do to get this output instead?

 `<headline>Hello|World|data | this|has|problem</headline>`
Dev Utkarsh
  • Sorry, I misunderstood your question originally. You're asking how to get the XML Output step to quit replacing the '|' characters with &#x7c;, right? Well, I can't replicate your problem. I read in your CSV data and wrote it to an XML output, and all is well. There must be something else going on. Can you post your whole transform? – Brian.D.Myers Feb 28 '14 at 01:45
  • Do you want to decode the HTML entity? Maybe use the JavaScript step in Kettle with http://stackoverflow.com/questions/5796718/html-entity-decode – Christoph Mar 03 '14 at 22:55
  • Check this answer: https://stackoverflow.com/questions/13635242/kettle-xml-output-step-changing-to-47/13648934#13648934 – jacktrade Mar 05 '14 at 20:50

1 Answer

You can add a String Operations step to your stream and apply its Unescape HTML option to the headline field. That should give you the desired result.
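To illustrate what that unescaping does, here is a minimal sketch outside of Kettle, using Python's standard library rather than the String Operations step itself (so treat it as an analogy, not as what Kettle runs internally). It also shows that an XML parser resolves `&#x7c;` back to `|` when it reads the element, so the two forms should carry the same text for any consumer that parses the file as XML.

```python
import html
import xml.etree.ElementTree as ET

# Value as it appears inside the <headline> element in the question.
escaped = "Hello&#x7c;World&#x7c;data &#x7c; this&#x7c;has&#x7c;problem"

# html.unescape resolves numeric character references such as &#x7c;.
unescaped = html.unescape(escaped)
print(unescaped)

# An XML parser resolves the same reference on its own when reading the element.
element = ET.fromstring("<headline>" + escaped + "</headline>")
print(element.text == unescaped)
```

If the file is only ever read by an XML parser, the escaped output may already be good enough; the unescape step matters mainly when something reads the file as plain text.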

MrMauricioLeite