I need to extract patient info from an HL7 XML document using Apache NiFi, and then apply a regex to extract diagnostic results from the sections that contain embedded HTML (yes, sorry, not my design choice :-( ).
The first path to the data of interest in the HL7 document is:
"ClinicalDocument" \ "recordTarget" \ "patientRole" \ "patient" \ "name",
and the second, more complicated one is:
"ClinicalDocument" \ "structuredBody" \ "component" \ "section" \ "text" (with @mediaType="text/x-hl7-text+xml"), where the value of the section's title element equals "Diagnostic Results".
In other words, I need to find the section within component whose title child node has the text value "Diagnostic Results", and then extract the text value of its sibling text node.
My HL7 XML snippets look like:
<ClinicalDocument>
...
<recordTarget>
<patientRole>
....
<patient>
<name><given>John</given><family>Doe</family></name>
...
<structuredBody>
...
<component>
<section classCode="DOCSECT" moodCode="EVN">
<templateId root="0.0.0.0.0.0.1" />
<code code="000-01" codeSystem="0.0.0.1.0.0" />
<title>Diagnostic Results</title>
<text mediaType="text/x-hl7-text+xml">
Some data of interest expressed in n microns.<content ID="NKN_results"/>
</text>
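To make the two lookups concrete, here is a minimal sketch in Python of the result I am after. It is only an illustration of the logic, not NiFi configuration. The sample document is a hypothetical, well-formed stand-in for my snippet, it assumes no XML namespace (a real CDA document normally declares urn:hl7-org:v3, which the paths would have to account for), and the regex and function names are placeholders of my own.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the snippet above (no namespace assumed).
SAMPLE = """<ClinicalDocument>
  <recordTarget>
    <patientRole>
      <patient>
        <name><given>John</given><family>Doe</family></name>
      </patient>
    </patientRole>
  </recordTarget>
  <structuredBody>
    <component>
      <section classCode="DOCSECT" moodCode="EVN">
        <title>Diagnostic Results</title>
        <text mediaType="text/x-hl7-text+xml">
Some data of interest expressed in n microns.<content ID="NKN_results"/>
        </text>
      </section>
    </component>
  </structuredBody>
</ClinicalDocument>"""


def patient_name(root):
    """First path: recordTarget/patientRole/patient/name."""
    name = root.find("./recordTarget/patientRole/patient/name")
    if name is None:
        return None
    given = name.findtext("given", default="")
    family = name.findtext("family", default="")
    return f"{given} {family}".strip()


def diagnostic_text(root):
    """Second path: the text node of the section titled 'Diagnostic Results'."""
    node = root.find(".//section[title='Diagnostic Results']/text")
    if node is None:
        return None
    # itertext() also collects text that follows embedded child elements.
    return "".join(node.itertext()).strip()


def micron_value(text):
    """Placeholder regex: capture the token before the word 'microns'."""
    match = re.search(r"expressed in (\S+) microns", text)
    return match.group(1) if match else None


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    print(patient_name(root))
    print(diagnostic_text(root))
    print(micron_value(diagnostic_text(root)))
```

The part I care about is the predicate section[title='Diagnostic Results'], which selects the section by the text of its title child before stepping to the sibling text node. I am looking for the NiFi equivalent of these three steps.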
Any suggestions on how to do this in Apache NiFi?