I have an EBCDIC flat file to be processed from a mainframe into a C module. What would be a good process for converting the COMP and COMP-3 values into readable values? Do I have to convert the EBCDIC characters to ASCII and then to hex for COMP-3? What about COMP? Thanks
-
By far the easiest thing to do is to create the file for the ASCII system without any non-display fields. It would be a simple COPY step with SORT and then nothing for you to do at your end. If somehow that is not possible, have a look through some of the questions here tagged `ebcdic` – Bill Woodger Mar 12 '14 at 09:49
-
So you're saying it would be better if my flat file was in ASCII format? – entendezEJ Mar 13 '14 at 06:34
-
What we (from the Mainframe) usually try to do, OK, perhaps just those who've been around a bit longer, is to specially create a file for the specific task, which is a logical copy of the data, physically in "character" format. That is EBCDIC when we create it. Then, using whatever utility you use to transfer it to a non-Mainframe machine (FTP, NDM, anything), get the utility to do its built-in EBCDIC-to-ASCII translation. The receiver then just needs to check headers and trailers (date expected, logical-file-name expected, record counts, some hash totals, etc.). Then your file arrives in ASCII. – Bill Woodger Mar 13 '14 at 07:51
-
You should have an agreement, signed off before you start work, as to how this is supposed to happen. Your side should argue for character data; they will moan about needing an extra program, but if they say it can't be done they are either lying to you, or incompetent, or just plain unprofessional. There should be no need for you to touch the data before it can be used (imagine data from a bank that has not been run past audit/compliance/regulatory/legal people). That is plain wrong. People do it. Checking the logical content should be all that you are required to do before processing. – Bill Woodger Mar 13 '14 at 07:54
-
I'm actually new to the project, so I don't know many of the details yet. I'll ask about that in a while. Yeah, I've already considered using a translation table for this. I was just hoping there's another way around that. – entendezEJ Mar 13 '14 at 08:02
-
They're asking me to do it all on the mainframe, like one whole program to convert the COMP and COMP-3 fields from the EBCDIC values to readable data. So far, I can't get my mind around it yet. Thanks Bill for your time. – entendezEJ Mar 13 '14 at 08:06
-
Two easy options are a COBOL program or SORT. Find a COBOL program that reads and writes a sequential file. Delete all but the IO processing stuff. Use the existing copybook for the input record, and create a new copybook (or a simple layout under the FD without a copybook) without the COMP/COMP-3 (so the fields will be USAGE DISPLAY by default); give the fields different names for the output. "Chop about" copies of your record layouts so you only have the names of the individual data-items, then add MOVE to the start of each input name and TO to the start of each output name (with a space in between, of course). – Bill Woodger Mar 13 '14 at 12:41
-
Then use the editor, or anything else you fancy, to "merge" the lines on a one-for-one basis, so you get your lines of code. Copy them into your program. Nearly there. You just need to consider any signs you have and what you want to do about any decimal places you have. – Bill Woodger Mar 13 '14 at 12:43
-
For signs, look in the COBOL manual (Enterprise COBOL Language Reference, any version) for `SIGN IS SEPARATE` and choose how you want it to appear. If actual decimal-points are useful to you, change the implied decimal-point (the V in a numeric `PIC`ture) to a `.`. If you need more assistance, or you want to consider the SORT option, please ask a new question. – Bill Woodger Mar 13 '14 at 12:47
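If the file does arrive in character form laid out this way, the work left in the C module is just slicing fixed columns and converting digits. Below is a minimal sketch, assuming a hypothetical record whose first field was written from something like `PIC +9(5).99` (a sign character, five integer digits, an explicit decimal point, two decimals); the offsets and widths are placeholders to be taken from the agreed record layout.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Sketch only: once the file is DISPLAY-formatted with a separate sign and
 * a real decimal point, each field is plain text in fixed columns.  The
 * 9-character width below is hypothetical; use the real copybook layout. */
int main(void)
{
    const char record[] = "+12345.67OTHER-DATA-FOLLOWS";
    char field[10];

    memcpy(field, record, 9);           /* first 9 columns hold the amount */
    field[9] = '\0';

    double amount = strtod(field, NULL);
    printf("amount = %.2f\n", amount);
    return 0;
}
```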
2 Answers
Bill Woodger has given you some very good advice in his comments on your question; in fact, he has answered it and should have posted his comments as an answer.
I would like to reiterate a few of his points and expand on a few others.
If you need to convert a file created from what is probably a COBOL application so it may be read by some other non-COBOL program, possibly on a machine with an architecture unlike the one where it was created, then you should demand that the file be created using only display-formatted data (i.e. all character data). Mashing non-display (binary, packed, encoded) data outside of the operating environment where it was created is just a formula for long-term pain. You will be subjected to the joys of sorting out various endianness issues between architectures and code page conversions. These are the things that file transfer protocols are designed to manage; they do it well, so don't try to reinvent them. Short answer: use FTP or a similar file transport mechanism to move data between machines, and only transport display (character) based data.
Packed Decimal (COMP-3) data types occupy a varying number of bytes depending on their specific PICTURE layout. The position of the decimal point is implied, so it cannot be determined without reference to the PICTURE used to define the field. Packed Decimal fields may be either signed or unsigned; the sign is embedded in the low 4 bits (nibble) of the last byte, typically C for positive, D for negative, and F for unsigned. Every other nibble holds a single decimal digit, so each byte contains two digits except the last byte (one digit plus the sign) and, when the PICTURE has an even number of digits, the first byte (a zero pad nibble plus one digit). There are several other subtleties that you need to be aware of if you want to do your own Packed Decimal to character conversions. At this point I hope you can see that this is not going to be a trivial exercise.
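For comparison, here is roughly what doing the conversion yourself looks like in C, assuming the raw bytes reach the program untranslated (a binary transfer; a character-mode EBCDIC-to-ASCII transfer would corrupt packed fields). The function name `unpack_comp3` and the example PICTURE are made up for this sketch, and the field length and decimal scale must come from the copybook.

```c
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

/* Sketch only: unpack an IBM packed-decimal (COMP-3) field into the signed,
 * unscaled digit value.  Length and implied decimal scale are not in the
 * data; they come from the COBOL PICTURE.  Returns 0 on success. */
static int unpack_comp3(const uint8_t *field, size_t len, long long *out)
{
    long long value = 0;

    for (size_t i = 0; i < len; i++) {
        uint8_t hi = field[i] >> 4;
        uint8_t lo = field[i] & 0x0F;

        if (hi > 9)
            return -1;                 /* every high nibble is a digit */
        value = value * 10 + hi;

        if (i + 1 < len) {             /* low nibble is a digit too ... */
            if (lo > 9)
                return -1;
            value = value * 10 + lo;
        } else {                       /* ... except in the last byte: the sign */
            if (lo == 0x0D || lo == 0x0B)
                value = -value;        /* D (or B) is negative */
            else if (lo < 0x0A)
                return -1;             /* C, F (and A, E) are positive/unsigned */
        }
    }
    *out = value;
    return 0;
}

int main(void)
{
    /* PIC S9(5)V99 COMP-3 is 4 bytes; X'1234567C' holds +12345.67 */
    const uint8_t raw[] = { 0x12, 0x34, 0x56, 0x7C };
    long long digits;

    if (unpack_comp3(raw, sizeof raw, &digits) == 0)
        /* scale of 2 comes from the PICTURE; printing negatives this way needs more care */
        printf("%lld.%02lld\n", digits / 100, digits % 100);
    return 0;
}
```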
Binary (COMP) data types have a different but no less complex set of issues to resolve: on the mainframe they are big-endian two's complement fields of 2, 4, or 8 bytes depending on the PICTURE, so byte order has to be dealt with on a little-endian machine, and again the decimal point is only implied. Not a trivial exercise either.
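A minimal sketch of that byte reassembly, under the same assumptions as the packed-decimal example (untranslated binary transfer, field length known from the copybook; `read_comp` is an invented name):

```c
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

/* Sketch only: convert a big-endian COMP (binary) field of 2, 4 or 8 bytes
 * into a native signed integer, independent of the host byte order.
 * As with COMP-3, any implied decimal scale comes from the PICTURE. */
static int64_t read_comp(const uint8_t *field, size_t len)
{
    uint64_t u = 0;

    for (size_t i = 0; i < len; i++)
        u = (u << 8) | field[i];                  /* most significant byte first */

    if (len < 8 && (field[0] & 0x80))             /* sign-extend negative values */
        u |= ~(uint64_t)0 << (8 * len);

    return (int64_t)u;
}

int main(void)
{
    const uint8_t halfword[] = { 0xFF, 0xFE };    /* a PIC S9(4) COMP holding -2 */
    printf("%lld\n", (long long)read_comp(halfword, sizeof halfword));
    return 0;
}
```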
So what should you be doing? Basically, do as Bill suggested. Have the program that generates this file use display formats for output (meaning you have to do nothing). Or, failing that, use a utility program such as DFSORT/SYNCSORT to do the conversions for you. Going the utility route still requires that you have the original COBOL file layout (and that you understand it) in order to do the conversion. The last resort is writing a simple read-a-record-write-a-record COBOL program that takes in the unformatted data, MOVEs each COMP-whatever field to a corresponding DISPLAY field, and writes it out again.
As Bill said, if the group that produced this file tells you that it is too difficult/expensive to produce a DISPLAY formatted output file they are lying to you or they are incompetent or just too lazy to do the job they were hired to do. I can think of no other excuses.

Use XML to transport data.
That is, write a program that converts your file into characters (if on the mainframe, stay with EBCDIC, but unpack the numeric fields, etc.) and then encloses each record and each field in XML tags.
This avoids formatting issues (which field is in column 1, which field is in column 2, whether the delimiters are spaces or commas or either, etc., ad nauseam).
Then transmit the XML file with your favorite utility that converts from EBCDIC to ASCII.
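As a rough illustration of the shape being suggested (the tag and field names are invented, and real data would also need XML escaping), a record-writing routine might look like this in C:

```c
#include <stdio.h>

/* Sketch only: wrap already-converted character fields of one record in XML
 * tags.  The tag and field names are placeholders for whatever the layout holds. */
static void write_record_xml(FILE *out, const char *account, const char *amount)
{
    fprintf(out, "  <record>\n");
    fprintf(out, "    <account>%s</account>\n", account);
    fprintf(out, "    <amount>%s</amount>\n", amount);
    fprintf(out, "  </record>\n");
}

int main(void)
{
    printf("<file>\n");
    write_record_xml(stdout, "0012345", "12345.67");  /* one call per input record */
    printf("</file>\n");
    return 0;
}
```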

-
The data is in fixed positions, so no problem with what is where. OP hasn't asked for XML. Even with delimited vs XML there is performance (number of fields, number of records) to be taken into account. – Bill Woodger Mar 18 '14 at 12:48
-
Bill Woodger, I rarely let the OP define technical solutions (XML vs. text). Performance considerations apply when comparing O(n) vs. O(n^2). Performance is not a consideration when comparing a 100-byte record vs. a 500-byte record. If necessary, for purposes of transmission, compress the data set. – JackCColeman Mar 24 '14 at 17:37
-
OK, sorry, didn't know they worked for you. They seem to be ignoring your spec; you'd better sort that out. Number of fields, I said, not number of bytes. If you parse a lot of XML and compare that to parsing an equivalent number of delimiters, you should see a difference. And a further, bigger difference against the equivalent fixed-length fields. Odd to suggest XML for fixed-length fields, but you're the boss. – Bill Woodger Mar 24 '14 at 18:36
-
@Bill Woodger, "should see a difference", ah yes, but do you? Not likely unless there is an order of magnitude difference between the two methods. The strength of XML is that it knows nothing about fixed or variable length fields. – JackCColeman Mar 24 '14 at 20:50