
We are facing a challenge reading COMP-3 data in Java embedded inside Pentaho ETL. There are a few Float values stored as packed decimals in a flat file along with other plain text. While the plain-text fields are read properly, the packed fields come out as junk characters. We tried Charset.forName("CP500");, but it never worked.

Since Pentaho scripts don't support COMP-3, their forums suggested going with a User Defined Java Class. Has anyone come across and solved a similar problem?

Guru
  • Why don't you get the file changed so that it doesn't have packed-decimal values, but "character" values, with an explicit sign, and either an explicit decimal point or a "scaling factor", whichever is easier for you. These are not "Float values"; they are 100% accurate decimal values of a fixed size (and fixed number of decimal places). – Bill Woodger Feb 15 '16 at 18:06
  • @Guru did you solve the issue? – sunleo Sep 19 '17 at 12:25
  • @sunleo I solved this by putting a Perl script between Java and Pentaho. It was very easy to convert the digits using a Perl script. – Guru Sep 21 '17 at 17:31

1 Answer


Is it a Cobol file? Do you have a Cobol Copybook? Possible options include:

  1. As Bill said, convert the Comp-3 fields to text on the source machine
  2. Write your own conversion code
  3. Use a library like JRecord. Note: I am the author of JRecord.

Converting Comp-3

In Comp-3:

Value    Comp-3 (signed)   Comp-3 (unsigned)   Zoned-Decimal
 123     x'123c'           x'123f'             "12C"
-123     x'123d'                               "12L"

There is more than one way to convert a comp-3 field to a decimal integer. One way is to:

  1. Convert x'123c' -> the String "123c"
  2. Drop the last character and use it to determine the sign

Java code to convert a comp-3 field from a byte array:

        public static String getMainframePackedDecimal(final byte[] record,
                                                       final int start,
                                                       final int len) {

            String hex  = getDecimal(record, start, start + len);
            String ret  = "";
            String sign = "";

            if (! "".equals(hex)) {
                // The last hex digit holds the sign: d = negative,
                // a/b/c/e/f = positive (c and d are the preferred values)
                switch (hex.substring(hex.length() - 1).toLowerCase().charAt(0)) {
                case 'd' :
                    sign = "-";            // deliberate fall-through
                case 'a' :
                case 'b' :
                case 'c' :
                case 'e' :
                case 'f' :
                    ret = sign + hex.substring(0, hex.length() - 1);
                    break;
                default:                   // no sign nibble; treat as unsigned
                    ret = hex;
                }
            }

            if ("".equals(ret)) {
                ret = "0";
            }
            return ret;
        }

        public static String getDecimal(final byte[] record, final int start, final int fin) {
            StringBuilder ret = new StringBuilder();

            for (int i = start; i < fin; i++) {
                int b = record[i] & 0xff;          // treat the byte as unsigned
                String s = Integer.toHexString(b);
                if (s.length() == 1) {
                    ret.append('0');               // keep two hex digits per byte
                }
                ret.append(s);
            }

            return ret.toString();
        }
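As a quick sanity check, the two methods above can be exercised on hand-packed values (x'123c' for +123, x'123d' for -123). The class below repeats the methods so it compiles stand-alone; the class name Comp3Demo and the sample bytes are mine, not from the question:

```java
// Self-contained demo of the comp-3 decode logic above;
// the bytes are hand-packed test values, not real mainframe data.
public class Comp3Demo {

    public static String getDecimal(final byte[] record, final int start, final int fin) {
        StringBuilder ret = new StringBuilder();
        for (int i = start; i < fin; i++) {
            int b = record[i] & 0xff;          // treat the byte as unsigned
            String s = Integer.toHexString(b);
            if (s.length() == 1) {
                ret.append('0');               // keep two hex digits per byte
            }
            ret.append(s);
        }
        return ret.toString();
    }

    public static String getMainframePackedDecimal(final byte[] record,
                                                   final int start,
                                                   final int len) {
        String hex  = getDecimal(record, start, start + len);
        String ret  = "";
        String sign = "";
        if (! "".equals(hex)) {
            switch (hex.substring(hex.length() - 1).toLowerCase().charAt(0)) {
            case 'd':
                sign = "-";                    // deliberate fall-through
            case 'a': case 'b': case 'c': case 'e': case 'f':
                ret = sign + hex.substring(0, hex.length() - 1);
                break;
            default:                           // no sign nibble
                ret = hex;
            }
        }
        return "".equals(ret) ? "0" : ret;
    }

    public static void main(String[] args) {
        System.out.println(getMainframePackedDecimal(new byte[]{0x12, 0x3c}, 0, 2)); // 123
        System.out.println(getMainframePackedDecimal(new byte[]{0x12, 0x3d}, 0, 2)); // -123
    }
}
```

Note the field must come from a byte stream (e.g. FileInputStream), not a Reader, or the packed bytes will already be corrupted before they reach this code.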

JRecord

In JRecord, if you have a Cobol Copybook, there is

  • Cobol2Csv: a program to convert a Cobol data file to CSV using a Cobol Copybook
  • Data2Xml: convert a Cobol data file to Xml using a Cobol Copybook
  • Read a Cobol data file with a Cobol Copybook
  • Read a fixed-width file with an Xml description
  • Define the fields in Java
Reading with Cobol Copybook in JRecord
        ICobolIOBuilder ioBldr = JRecordInterface1.COBOL
                .newIOBuilder(copybookName)
                    .setDialect(ICopybookDialects.FMT_MAINFRAME)
                    .setFont("cp037")
                    .setFileOrganization(Constants.IO_FIXED_LENGTH)
                    .setDropCopybookNameFromFields(true);
        AbstractLine saleRecord;

        AbstractLineReader reader  = ioBldr.newReader(salesFile);
        while ((saleRecord = reader.read()) != null) {
            ....
        }

        reader.close();
Defining the File in Java with JRecord
        AbstractLineReader reader = JRecordInterface1.FIXED_WIDTH.newIOBuilder()
                                .defineFieldsByLength()
                                    .addFieldByLength("Sku"  , Type.ftChar,   8, 0)
                                    .addFieldByLength("Store", Type.ftNumRightJustified, 3, 0)
                                    .addFieldByLength("Date" , Type.ftNumRightJustified, 6, 0)
                                    .addFieldByLength("Dept" , Type.ftNumRightJustified, 3, 0)
                                    .addFieldByLength("Qty"  , Type.ftNumRightJustified, 2, 0)
                                    .addFieldByLength("Price", Type.ftNumRightJustified, 6, 2)
                                .endOfRecord()
                                .newReader(this.getClass().getResource("DTAR020_tst1.bin.txt").getFile());
        AbstractLine saleRecord;

        while ((saleRecord = reader.read()) != null) {
        }

Zoned Decimal

Another Mainframe-Cobol numeric format is Zoned-Decimal. It is a text format where the sign is overpunched on the last digit. In zoned-decimal, 123 is "12C" while -123 is "12L".
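A minimal sketch of decoding such a value, assuming the field has already been translated to display characters (as in "12C"/"12L" above) and using the common EBCDIC overpunch convention ('{', 'A'–'I' for +0..+9; '}', 'J'–'R' for -0..-9); the class name ZonedDecimalDemo is mine:

```java
// Sketch: decode a zoned-decimal string whose last character
// carries an overpunched sign, e.g. "12C" -> 123, "12L" -> -123.
public class ZonedDecimalDemo {

    public static long decodeZoned(String s) {
        char last = s.charAt(s.length() - 1);
        long sign = 1;
        int digit;
        if (last >= '0' && last <= '9') {
            digit = last - '0';                 // no overpunch, unsigned
        } else if (last == '{') {
            digit = 0;                          // +0
        } else if (last >= 'A' && last <= 'I') {
            digit = last - 'A' + 1;             // +1 .. +9
        } else if (last == '}') {
            digit = 0; sign = -1;               // -0
        } else if (last >= 'J' && last <= 'R') {
            digit = last - 'J' + 1; sign = -1;  // -1 .. -9
        } else {
            throw new IllegalArgumentException("bad sign character: " + last);
        }
        long value = 0;
        for (int i = 0; i < s.length() - 1; i++) {
            value = value * 10 + (s.charAt(i) - '0');  // leading plain digits
        }
        return sign * (value * 10 + digit);
    }

    public static void main(String[] args) {
        System.out.println(decodeZoned("12C"));  // 123
        System.out.println(decodeZoned("12L"));  // -123
    }
}
```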

Bruce Martin
  • I did have a look at JRecord. But the file which I am going to process is a plain text flat file. Just that 5 columns in that file are of COMP-3 type. – Guru Feb 16 '16 at 00:43
  • I tried some conversion code based on several examples. The result values did not match the expected values. Since the Data Junction tool (which we used for the data extract) supports COMP-3 encoding, it converts the data by default and we use that data for validation. What the Data Junction tool does is what we are trying to achieve using Java, and that is the requirement. – Guru Feb 16 '16 at 00:52
  • How did you read the file? You need to read it as bytes via a stream. If you read the file as text (e.g. via a FileReader), you will **corrupt** the Comp-3 values. Can you provide the code you are using? – Bruce Martin Feb 16 '16 at 01:54
  • One final question, are you positive it is comp-3 and not Zoned-Decimal. I will add Zoned-Decimal details to the answer – Bruce Martin Feb 16 '16 at 03:04
  • Yeah. I am sure it is COMP-3 as Data Junction ETL tool is able to decode it. – Guru Feb 16 '16 at 03:51
  • @Guru Can you show a sample of your data, in hex, in your question. I think Data Junction seems to understand BCD as well, which is not packed-decimal, but is often referred to as such. – Bill Woodger Feb 16 '16 at 14:22