CONCLUSION:
For some reason the flow wouldn't let me convert the incoming message to a BLOB by changing the Message Domain property of the Input Node so I added a Reset Content Descriptor node before the Compute Node with the code from the accepted answer. On the line that parses the XML and creates the XMLNSC Child for the message I was getting a 'CHARACTER:Invalid wire format received' error so I took that line out and added another Reset Content Descriptor node after the Compute Node instead. Now it parses and replaces the Unicode characters with spaces. So now it doesn't crash.
Here is the code for the added Compute Node:
CREATE FUNCTION Main() RETURNS BOOLEAN
BEGIN
DECLARE NonPrintable BLOB X'0001020304050607080B0C0E0F101112131415161718191A1B1C1D1E1F7F808182838485868788898A8B8C8D8E8F909192939495969798999A9B9C9D9E9FA0A1A2A3A4A5A6A7A8A9AAABACADAEAFB0B1B2B3B4B5B6B7B8B9BABBBCBDBEBFC0C1C2C3C4C5C6C7C8C9CACBCCCDCECFD0D1D2D3D4D5D6D7D8D9DADBDCDDDEDFE0E1E2E3E4E5E6E7E8E9EAEBECEDEEEFF1F2F3F4F5F6F7F8F9FAFBFCFDFEFF';
DECLARE Printable BLOB X'20202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020';
DECLARE Fixed BLOB TRANSLATE(InputRoot.BLOB.BLOB, NonPrintable, Printable);
SET OutputRoot = InputRoot;
SET OutputRoot.BLOB.BLOB = Fixed;
RETURN TRUE;
END;
UPDATE:
The message is being parsed as XML using XMLNSC. Thought that would cause a problem, but it does not appear to be.
Now I'm using PHP. I've created a node to plug into the legacy flow. Here's the relevant code:
class fixIncompetence {
function evaluate ($output_assembly,$input_assembly) {
$output_assembly->MRM = $input_assembly->MRM;
$output_assembly->MQMD = $input_assembly->MQMD;
$tmp = htmlentities($input_assembly->MRM->VALUE_TO_FIX, ENT_HTML5|ENT_SUBSTITUTE,'UTF-8');
if (!empty($tmp)) {
$output_assembly->MRM->VALUE_TO_FIX = $tmp;
}
// Ensure there are no null MRM fields. MessageBroker is strict.
foreach ($output_assembly->MRM as $key => $val) {
if (empty($val)) {
$output_assembly->MRM->$key = '';
}
}
}
}
Right now I'm getting a vague error about read only messages, but before that it wasn't working either.
Original Question:
For some reason I am unable to impress upon the senders of our MQ messages that smart quotes, endashes, emdashes, and such crash our XML parser.
I managed to make a working solution with SQL queries, but it wasted too many resources. Here's the last thing I tried, but it didn't work either:
CREATE FUNCTION CLEAN(IN STR CHAR) RETURNS CHAR BEGIN SET STR = REPLACE('–',STR,'–'); SET STR = REPLACE('—',STR,'—'); SET STR = REPLACE('·',STR,'·'); SET STR = REPLACE('“',STR,'“'); SET STR = REPLACE('”',STR,'”'); SET STR = REPLACE('‘',STR,'&lsqo;'); SET STR = REPLACE('’',STR,'’'); SET STR = REPLACE('•',STR,'•'); SET STR = REPLACE('°',STR,'°'); RETURN STR; END;
As you can see I'm not very good at this. I have tried reading about various ESQL string functions without much success.