EBCDIC unpacking comp-3 data returns 40404** in Java

Question

I used the unpacking logic from the linked question "How to unpack COMP-3 digits using Java?". For null data in the source, however, the Java unpack code returns 404040404. I understand that 0x40 is a space in EBCDIC, but how can I handle these spaces, or avoid them, when unpacking?

You could simply compare each byte of the field with 0x40, then replace it with 0x00 for all bytes except the last and make that one 0x0f. So if it's a 3-byte field, you would check for 0x40 in [0], [1] and [2], and if every one is 0x40, replace them with [0] = 0x00, [1] = 0x00, [2] = 0x0f, and that will unpack to 0 — Hogstrom, Apr 01 '19 at 12:57
I tried that, but it replaces the numeric value 4 as well. char[] NON_PRINTABLE_EBCDIC_CHARS = new char[] { 0x00, 0x40 }; for (int currentByteIndex = 0; currentByteIndex < packedData.length; currentByteIndex++) { //added for comp and comp3 datatype issue with null int character = packedData[currentByteIndex]; boolean isBlank = false; for (char nonPrintableChar : NON_PRINTABLE_EBCDIC_CHARS) { if (nonPrintableChar == (char) character) { isBlank=true; } } if(!isBlank) { — Rajesh, Apr 01 '19 at 13:02
If the source value is 444, will it decode correctly? It won't be replaced with zero? — Rajesh, Apr 01 '19 at 13:05
If you're seeing spaces, you likely have an input data error or a misalignment. No well-designed COBOL program would ever place spaces in a COMP-3 field. — zarchasmpgmr, Apr 01 '19 at 15:59
I would not check every byte but just look at the last byte. If it contains a valid sign-nibble I would go along with trying to unpack the field and come up with some strategy for all other cases (return 0, throw an error,...?). — piet.t, Apr 02 '19 at 07:22

score 1 · Accepted Answer · answered Apr 04 '19 at 20:58

There are two questions to deal with. First, is the data valid COMP-3 data? Second, since COMP-3 was mentioned, is the data considered “valid” by older language implementations such as COBOL?

If the offsets are not misaligned, it would appear that existing programs interpret the spaces as 0 rather than as spaces. That is incorrect, but it could be an artifact of older programs that were engineered to tolerate this bad behaviour.

The approach I would take in a legacy shop (assuming no misalignment) is to treat “spaces” (a field filled entirely with 0x40 bytes, such as 0x404040404040) as zero. This would be a legacy check: compare the whole field with spaces and, if it matches, assume 0x00000000000F (packed zero) as the actual default. Because the whole field is compared rather than individual bytes, a legitimate value that merely contains the digit 4 is left alone. This is something an individual shop would have to decide for itself; it is not recognized as a general programming approach.
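The whole-field check described above could be sketched roughly as follows. This is a minimal illustration, not the code from the linked question: the class name `Comp3Unpacker` and its methods are hypothetical, the field is assumed to hold at most 18 digits so that it fits in a `long`, and any decimal scaling is left out. Whether spaces should really map to zero remains a decision for the individual shop.

```java
final class Comp3Unpacker {
    private static final byte EBCDIC_SPACE = (byte) 0x40;

    /** True only if every byte of a non-empty field is an EBCDIC space. */
    static boolean isAllSpaces(byte[] field) {
        if (field.length == 0) {
            return false;
        }
        for (byte b : field) {
            if (b != EBCDIC_SPACE) {
                return false;
            }
        }
        return true;
    }

    /** Unpacks a COMP-3 field; a field of all spaces is tolerated as zero (shop-specific assumption). */
    static long unpack(byte[] field) {
        if (field.length == 0) {
            throw new IllegalArgumentException("empty field");
        }
        if (isAllSpaces(field)) {
            return 0L; // legacy tolerance: spaces are treated as packed zero
        }
        long value = 0;
        for (int i = 0; i < field.length; i++) {
            int hi = (field[i] >> 4) & 0x0F; // mask removes sign extension
            int lo = field[i] & 0x0F;
            if (hi > 9) {
                throw new IllegalArgumentException("invalid digit nibble at byte " + i);
            }
            value = value * 10 + hi;
            if (i < field.length - 1) {
                if (lo > 9) {
                    throw new IllegalArgumentException("invalid digit nibble at byte " + i);
                }
                value = value * 10 + lo;
            } else {
                // The low nibble of the last byte is the sign: 0xA to 0xF.
                if (lo < 0x0A) {
                    throw new IllegalArgumentException("invalid sign nibble");
                }
                if (lo == 0x0D || lo == 0x0B) {
                    value = -value; // D and B mean negative
                }
            }
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(unpack(new byte[] {0x40, 0x40, 0x40})); // 0
        System.out.println(unpack(new byte[] {0x44, 0x4C}));       // 444
        System.out.println(unpack(new byte[] {0x40, 0x4C}));       // 404
        System.out.println(unpack(new byte[] {0x12, 0x3D}));       // -123
    }
}
```

Because only a field made up entirely of 0x40 bytes is replaced, values such as 444 or 404 still decode normally. A field that is partly spaces ends in an invalid sign nibble and is rejected, which usually points to misalignment or bad input rather than something to tolerate.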

In terms of Java, one has to remember that bytes are “signed”, so comparisons can be tricky depending on how the code is written. The only “unsigned” data type I recall in Java is char, which is really two bytes (basically a uint16).
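A small sketch of the signed-byte pitfall, using a hypothetical `SignedByteDemo` class. The value 0x40 itself is below 0x80 and compares without surprises; the trouble starts with bytes of 0x80 and above, which covers the EBCDIC digits 0xF0 to 0xF9 and many packed bytes.

```java
final class SignedByteDemo {
    /** Widens a byte to an int in the range 0..255 by masking off the sign extension. */
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte b = (byte) 0xF1; // EBCDIC '1'; stored as -15 in a signed Java byte

        System.out.println(b == 0xF1);                     // false: -15 is compared with 241
        System.out.println(toUnsigned(b) == 0xF1);         // true
        System.out.println(Byte.toUnsignedInt(b) == 0xF1); // true (available since Java 8)
    }
}
```

Masking with `0xFF` before comparing or shifting keeps the nibble arithmetic independent of the sign bit.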

This is less a programming problem than a matter of recognizing historical tolerance and deciding how to remediate it.
