
I have a file with some COMP-3 encoded fields. Can someone please tell me how I can test the code from the thread below?

How to unpack COMP-3 digits using Java?

The code I tried is:

import java.io.BufferedReader;
import java.io.FileReader;

try (BufferedReader br = new BufferedReader(new FileReader(FILENAME))) {

    String sCurrentLine;
    int i = 0;
    String bf = "";   // accumulates the numeric value of each character
    while ((sCurrentLine = br.readLine()) != null) {
        i++;
        System.out.println("FROM BYTES ");
        System.out.println(unpackData(sCurrentLine.getBytes(), 5));

        // append the numeric value of each character in the line
        for (int j = 0; j < sCurrentLine.length(); j++) {
            char c = sCurrentLine.charAt(j);
            bf = bf + (int) c;
        }
    }
}

The above code is not giving the correct result. I tried converting a single column, but that does not return the correct result either. My input column: (screenshot)

The input file looks like: (screenshot)

I tried JRecord, passing the cbl copybook and the data file; it generates Java code which does not give the same result. Generated output: (screenshot)

Required output: (screenshot)

The cbl copybook looks like the image below: (screenshot)

Rick Smith
Rahul Patel

4 Answers


The accepted answer in How to unpack COMP-3 digits using Java? might work if you are working with Ascii based Cobol. It will not work when reading Mainframe Ebcdic files with a FileReader.

You have marked the question as Mainframe - Ebcdic.

To process the file correctly:

  1. Do a Binary transfer from the Mainframe (or run on the mainframe). Do not do an ascii conversion; this will corrupt the comp-3 fields.
  2. Read the file as a stream and process it as bytes (see the sketch below).

The answer in COMP-3 data unpacking in Java (Embedded in Pentaho) will work; there are other answers on stackoverflow that will work as well.

Trying to process Comp-3 data as characters is error prone.
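
As an illustration of step 2, here is a minimal sketch of reading a binary-transferred file and unpacking a single COMP-3 field by hand. It is not code from any of the linked answers; the record length, field offset, field size, and PIC clause are made-up examples, not values from your copybook:

import java.io.DataInputStream;
import java.io.FileInputStream;
import java.math.BigDecimal;

public class Comp3Demo {

    // Unpack a COMP-3 (packed decimal) field: two digits per byte,
    // with the low nibble of the last byte holding the sign (C/F = +, D = -).
    static BigDecimal unpackComp3(byte[] record, int offset, int len, int scale) {
        long value = 0;
        for (int i = offset; i < offset + len; i++) {
            int hi = (record[i] & 0xF0) >>> 4;
            int lo = record[i] & 0x0F;
            if (i < offset + len - 1) {
                value = value * 100 + hi * 10 + lo;   // two digits per byte
            } else {
                value = value * 10 + hi;              // last digit ...
                if (lo == 0x0D) {                     // ... plus the sign nibble
                    value = -value;
                }
            }
        }
        return BigDecimal.valueOf(value, scale);
    }

    public static void main(String[] args) throws Exception {
        // Read the binary-transferred file as bytes - never with a Reader.
        try (DataInputStream in = new DataInputStream(new FileInputStream(args[0]))) {
            byte[] record = new byte[100];            // assumed fixed record length
            while (in.read(record) == record.length) {  // simplistic read loop
                // e.g. a PIC S9(7)V99 COMP-3 field at offset 10, 5 bytes long
                System.out.println(unpackComp3(record, 10, 5, 2));
            }
        }
    }
}

In practice a library such as JRecord (below) works out the offsets and conversions from the copybook for you.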


JRecord

If you have a Cobol copybook, the JRecord library will let you read the file using a Cobol copybook. It contains a document ReadMe_NewUsers.html that goes through the basics.
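
Very roughly, reading a file with JRecord looks something like the sketch below. The builder and reader method names are quoted from memory and may differ between JRecord versions, and the copybook name, data file name, code page, and field name are placeholders; check ReadMe_NewUsers.html for the definitive examples:

import net.sf.JRecord.JRecordInterface1;
import net.sf.JRecord.Details.AbstractLine;
import net.sf.JRecord.IO.AbstractLineReader;
import net.sf.JRecord.def.IO.builders.ICobolIOBuilder;

public class ReadWithCopybook {
    public static void main(String[] args) throws Exception {
        // Build a reader from the Cobol copybook; cp037 is a common EBCDIC code page.
        ICobolIOBuilder ioBldr = JRecordInterface1.COBOL
                .newIOBuilder("yourCopybook.cbl")      // placeholder copybook name
                .setFont("cp037");

        AbstractLineReader reader = ioBldr.newReader("yourDataFile.bin"); // placeholder file
        AbstractLine line;
        while ((line = reader.read()) != null) {
            // Field names come from the copybook; this one is purely illustrative.
            System.out.println(line.getFieldValue("CLASS-ORDER-EDG").asString());
        }
        reader.close();
    }
}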

RecordEditor

The Generate >>> Java~JRecord code for cobol menu option of the RecordEditor will generate Java~JRecord code from a Cobol copybook (and optionally a data file). (screenshot)

There are details on generating code in the answer to How do I identify the level of a field in copybook using JRecord in Java? or have a look here.

Also, in the RecordEditor, the Record Layouts >>> Load Cobol Copybook option will load a Cobol copybook; you can then use the Layout to view the file.

Bruce Martin
  • Yeah, I have gone through the thread, but we have a compressed file; the data is not in hex format (e.g. x'123f' ?? ), it has special characters. How do I convert that file to hex format first? I have the copybook of the data file. Some of the fields are Char and others are fixed. – Rahul Patel Aug 08 '17 at 14:27
  • Without seeing the file, I can not help you. If the data is compressed - what is the compression format. The mainframe has its own hardware based compression + the usual compression algorithms (e.g. zip etc) can be run. – Bruce Martin Aug 09 '17 at 00:30
  • I am trying to generate equivalent Java code. In the Cobol code, the data type has been changed from Fixed(15) INIT(0) to PIC'-----------9V.999' INIT in the output layout, and they are writing an output file. I am working on the same input file. For an input file snapshot please check the image above. – Rahul Patel Aug 09 '17 at 10:43
  • On the mainframe side, they are FTPing the mainframe flat file to Windows using binary format. This is the technique they are using to generate the text file. Once the file has been generated it looks like the image above (edited part). – Rahul Patel Aug 09 '17 at 10:58
  • If you have a Cobol copybook, have a look at this answer https://stackoverflow.com/questions/45529152/how-do-i-identify-the-level-of-a-field-in-copybook-using-jrecord-in-java/45557922#45557922 The Generate function should generate Java~JRecord code to read the file for you. There are several `templates`; the `standard` template will generate basic JRecord code; other `templates` can convert the Cobol records into `pojos`. You will need the JRecord library: https://sourceforge.net/projects/jrecord – Bruce Martin Aug 09 '17 at 11:15
  • Please see Edited part above – Rahul Patel Aug 09 '17 at 15:18
  • Rahul, this should be a separate question, and what have you done to solve the problem? It is quite easy. Hint: try matching the data with the Cobol copybook. – Bruce Martin Aug 09 '17 at 23:00
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/151550/discussion-between-bruce-martin-and-rahul-patel). – Bruce Martin Aug 09 '17 at 23:31

The best way to manipulate packed decimal is to use the IBM Data Access Accelerator API. It uses an IBM-specific JVM optimization called packed objects, which is a technology for efficiently working on native data. There's some good Java code on SO for processing packed decimal data, but the Data Access Accelerator is the sensible choice. It blows the roll-your-own (RYO) code away.
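
For reference, a call into that API looks roughly like the hedged sketch below. It only runs on IBM JDKs (package com.ibm.dataaccess), the class and method signature are quoted from memory, and the packed bytes, precision, and scale are invented example values; check the Javadoc shipped with the IBM SDK before relying on it:

import java.math.BigDecimal;
import com.ibm.dataaccess.DecimalData;

public class DaaDemo {
    public static void main(String[] args) {
        // 5 bytes of packed decimal holding +123456789 (example value)
        byte[] packed = { 0x12, 0x34, 0x56, 0x78, (byte) 0x9F };

        // precision = number of digits, scale = implied decimal places (assumed meaning)
        BigDecimal value =
            DecimalData.convertPackedDecimalToBigDecimal(packed, 0, 9, 2, true);

        System.out.println(value);   // expected: 1234567.89
    }
}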

David Crayford

If you compare the copybook with the data, you will see it does not match.

In particular, Class-Order-edg is defined as pic 9(3), but it looks like it is binary in the file.

Bils-count-edg looks to be shifted 6 bytes. This is consistent with fields Class-order-edg --> Country-no-edg being changed to comp-3/comp. The copybook appears to be out of date.

Bruce Martin
  • Please check out thread https://stackoverflow.com/questions/45637188/unpacking-comp-3-digit-using-record-editor-jrecord – Rahul Patel Aug 11 '17 at 14:06

The meaning of the term "COMP-3 encrypted file" is not clear to me, but I think you are saying that you have a file that was transferred from a zOS (EBCDIC) based system to an ASCII based system and you want to be able to process the values contained in the COMP-3 (packed decimal fields). If that is correct, I have some code and research that is relevant to your need.

I am assuming that the file was converted from EBCDIC to ASCII when it was transferred from zOS.

It is a common misconception that if COMP-3 (packed decimal) data is converted from EBCDIC to ASCII it gets "corrupted". That is not the case. What you get are values ranging from x'00' - x'0F'. Regardless of whether you are on an EBCDIC or ASCII based system, the hexadecimal values in that range are the same.

If the data is viewed outside of a hex editor [on either system] it appears to be corrupt. Depending on the code page, the packed decimal number 01234567890 may display as ⌁杅ྉ. However, using a hex editor you can see that the value is actually x'01 23 45 67 89 0F'. Two digits are stored in a single byte (one digit in each nibble, with the last nibble in the last byte being the sign). When each byte is converted from hex the actual digits are returned. For example, using Lua, if the variable iChar contains x'23', the function oDec = string.format("%X", iChar) returns the text value of "23", which can be converted to a number. By iterating over the entire string of x'01 23 45 67 89 0F' the actual number (01234567890) is returned. The number can be "repacked" by reversing the process.

Sample code to unpack a packed decimal field is shown below:

--[[ Lua 5.2.3 ]]
--[[ Author: David Alley 
         Written: August 9, 2017 ]]
--[[ Begin Function ]]
function xdec_unpack (iHex, lField, lNumber)
--[[
This function reads packed decimal data (converted from EBCDIC to ASCII) as input
and returns unpacked ASCII decimal numbers.
--]]
    if iHex == nil or iHex == ""
        then
            return iHex
    end             
    local aChar = {}     
    local aUnpack = {} 
    local iChar = ''
    for i = 1, lField do
        aChar[i] = string.byte(iHex, i)
    end
    for i, iChar in ipairs(aChar) do
        local oDec = string.format("%X", iChar)
        if string.len(oDec) == 1
            then
                table.insert(aUnpack, "0" .. oDec) --[[ Handles binary zeros ]]
            else
                table.insert(aUnpack, oDec)
        end
    end
    if string.len(table.concat(aUnpack)) - 1 ~= lNumber
        then
            aUnpack[1] = string.sub(aUnpack[1],2,2)
    end
return table.concat(aUnpack)
end
--[[ End Function xdec_unpack ]]

--[[ The code below was written for Linux and reads an entire file. It assumes that there is only one field, and that 
         field is in packed decimal format. Packed decimal format means that the file was transferred from a z/OS (EBCDIC) system 
         and the data was converted to ASCII.

         It is necessary to supply the field length because when Lua encounters binary zeros (common in packed decimal), 
       they are treated as an "end of record" indicator. The packed length value is supplied by the variable lField and the
         unpacked length value is supplied by the variable lNumber. 

         Since there is only one field, that field by default, is at the end of each record (the field is the record). Therefore, 
         any "new line" values (0x0a for Linux) must be included when reading records. This is handled by adding 1 to the variable 
         lField when reading records. Therefore, this code must be modified if there are multiple fields, and/or the packed decimal
         field is not the last field in the record.

         The sign is dropped from the unpacked value that is returned from the function xdec_unpack by removing the last byte from the 
         variable Output1 before writing the records. Therefore, this code must be modified if it is necessary to process negative 
         numbers. ]]

local lField = 7    --[[ This variable is the length of the packed decimal field before unpacking and is required by the
                         xdec_unpack function. ]]
local lNumber = 12  --[[ This variable is the length of the unpacked decimal field not including the sign. It is required by the
                         xdec_unpack function. Its purpose is to determine if a high order zero (left zero) is to be removed. This
                         occurs in situations where the packed decimal field contains an even number of digits. For example,
                         0123456789. ]]
local sFile = io.open("/home/david/Documents/Lua/Input/Input2.txt", "r")
local oFile = io.open("/home/david/Documents/Lua/Input/Output1.txt", "w")
while true do
    sFile:seek("cur")
    local sLine = sFile:read(lField + 1)        
    if sLine == nil then break end
    local Output1 = xdec_unpack(sLine, lField, lNumber) --[[ Call function to unpack ]]
  Output1 = string.sub(Output1,1, #Output1 - 1) --[[ Remove sign ]]
    oFile:write(Output1, "\n")
end
sFile:close()
oFile:close()
sdalley
  • The notion of the data not getting "corrupted" in some way is just wrong. `x'40'` is a perfectly valid byte in a COMP-3 field and will be converted to `x'20'`. `x'30'` is a valid byte in a COMP-3 field and does not have a defined character associated with it. How will ASCII-conversion handle it? – piet.t Aug 14 '17 at 07:51
  • COMP-3, by definition, is packed decimal. The only valid values are 0 - 9 and the letter that represents the sign (e.g. "F") which is the last nibble in the last byte. On z/OS, if you try to do arithmetic on a COMP-3 field and it contains x'40' or any non-number, the program abends with an S0C7 (data exception). Here is a quick reference: https://s0c7.blogspot.com/. Every byte contains 2 digits with the last byte containing 1 digit and the sign. – sdalley Aug 26 '17 at 01:45
  • So if every byte contains two digits, what is wrong about `x'40'`? One digit `4`, one digit `0`, everything just as you described it. Note: I'm not talking about the last byte of the field, but a byte somewhere in the middle. – piet.t Aug 28 '17 at 07:52
  • While the digits 4 and 0 may appear together in a packed decimal field, that is not equal to x'40' (a space in EBCDIC) and does not display as such. The value is 40, not x'40'. A packed decimal field must be processed in its entirety to be a valid number. The number 40 would be stored as x'04 0F' (assuming an unsigned number). – sdalley Sep 09 '17 at 02:33
  • "A packed decimal field must be processed in its entirety" - and exactly that is what regular file-transfer tools don't do. They look at one byte at a time and when they encounter a packed field with value `400` they will convert the byte-sequence `x'40 0F'` to `x'20 0F'` when converting EBCDIC to ASCII - assuming they are handling the shift-in character correctly... – piet.t Sep 11 '17 at 05:55
  • When transferring packed decimal data from an EBCDIC to an ASCII system you always select the "binary" option. This prevents the data from being translated to ASCII and therefore leaves the packed decimal fields intact. Then you can use the code I pasted above (or equivalent) to process those fields. Try it for yourself. That code was tested on EBCDIC data that was transferred as I described. Obviously, if there are other types of data such as character text, that will need to be converted to ASCII, but that is a separate issue. – sdalley Sep 11 '17 at 20:24
  • So here we agree: when transferring packed decimal fields it is mandatory to do so as binary and avoid any EBCDIC->ASCII conversion. But that seems to contradict what you wrote above: "It is a common misconception that if COMP-3 (packed decimal) data is converted from EBCDIC to ASCII that it gets "corrupted"." – piet.t Sep 12 '17 at 05:46
  • You are correct. First I stated, "I am assuming that the file was converted from EBCDIC to ASCII when it was transferred from zOS." This was my response to "... data is not in hex format(e.g. x'123f' ?? ) it has special characters." Then I stated, "It is a common misconception that if COMP-3 (packed decimal) data is converted from EBCDIC to ASCII that it gets 'corrupted'." I used the word "converted" when I should have said "transferred". I should have also mentioned the "binary" option at that point. My apologies for the confusion. If there was any value, it was likely lost in the confusion. – sdalley Sep 13 '17 at 00:15