Check half-width Katakana characters in COBOL

Question

I'm working on Red Hat 6 and using COBOL. I want to check every single character of a variable: if it's half-width, CONTINUE; otherwise, DISPLAY ERROR. Basically, I list all the half-width characters in the WHEN clauses of an EVALUATE statement, like this:

PERFORM VARYING WK-IX FROM 1 BY 1 UNTIL WK-IX > WK-LENGTH
    EVALUATE WK-FORMAT-CHK-VALUE(WK-IX:1)
        WHEN 'A'
        WHEN 'B'
        WHEN 'C'
            CONTINUE
        WHEN OTHER
            DISPLAY 'ERROR'
    END-EVALUATE
END-PERFORM.

Everything is OK, but when I compile I have a problem with half-width katakana characters. The compiler says: "The ending quotation mark of the literal is missing. The characters at the end of Area B are assumed to be a literal" for every line that checks one of these characters:

ﾂﾃﾄﾅﾆﾇﾈﾉﾊﾋﾌﾍﾎﾏﾐﾑﾒﾓﾔﾕﾖﾗﾘﾙﾚﾛﾜｦﾝ

I'm sure that none of these lines is missing the ending quotation mark. They look like this:

WHEN 'ﾂ'
WHEN 'ﾃ'
WHEN 'ﾄ'

But these characters are OK, and I don't know why:

ｱｲｳｴｵｶｷｸｹｺｻｼｽｾｿﾀﾁ

Can anyone help me, please? Sorry for my bad English!

Those characters will be "multi-byte", probably. Did you look at your documentation? What is the definition of the field you are checking? – Bill Woodger, Mar 08 '17 at 09:43
The field I'm checking has type X, and my documentation says "X ：半角英数文字". I don't know Japanese, so I used Google Translate: it means "half-width alphanumeric characters". I think that to check whether a character is half-width, I'll need to check the alphabet, numerals, and hiragana and katakana (the Japanese syllabaries). – tieuquynd, Mar 08 '17 at 10:22
You haven't said which compiler you use. Do you have colleagues? What do they say? The first character on your "doesn't work" list looks the same as one on your "works" list. Typo? – Bill Woodger, Mar 08 '17 at 11:21
Sorry, that's a typo. The 'ﾂ' character isn't in the "works" list. I use the compiler that the customer supplies. It doesn't have a name: I just put the source file in place and type a command in the terminal to compile. My colleagues and I are all newbies in COBOL, so they don't know for sure either. – tieuquynd, Mar 09 '17 at 03:37
Since you can't form a simple literal with some of them, I'd have to think some are single-byte and some are double-byte. For the ones which fail, try to see whether the literal is accepted using N'failingsymbol', as @saggingrufus shows. You could also look at the source file with a hex editor; there is something about that literal that the compiler does not like. It may be multi-byte in the source representation, even if the character set your COBOL program uses has it as a single-byte character. – Bill Woodger, Mar 09 '17 at 08:39
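
If no hex editor is at hand, a few lines of script can do the same inspection as the hex-editor suggestion above. This is only a minimal sketch in Python (not COBOL, since the compiler here is unidentified); the function name non_ascii_lines and the sample bytes are hypothetical, and the sample assumes a UTF-8 source file, which may not match the real one.

```python
def non_ascii_lines(raw: bytes):
    """Yield (line_number, hex_dump) for every line holding a byte >= 0x80."""
    for number, line in enumerate(raw.splitlines(), start=1):
        if any(byte >= 0x80 for byte in line):
            yield number, line.hex(" ")


# Hypothetical sample: two WHEN lines as they would be stored in a UTF-8 file.
# For a real check, read the source file instead: open(path, "rb").read()
sample = "    WHEN 'A'\n    WHEN 'ﾂ'\n".encode("utf-8")

for number, dump in non_ascii_lines(sample):
    print(number, dump)
```

In the dump, the bytes between the two 27 values (the quotation marks) are what the compiler actually receives for the literal. If there is more than one byte there, the character is not single-byte in that file, whatever it looks like in the editor.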

score 1 · Answer 1 · answered Mar 08 '17 at 11:32

Because these Katakana characters are treated as multi-byte (as Bill Woodger mentioned), you will need to make sure that the NSYMBOL and DBCS compiler options are enabled. These are IBM Enterprise COBOL option names, so another compiler may call them something else. After that, you should be able to define the literals like this:

EVALUATE WK-FORMAT-CHK-VALUE(WK-IX:1)
   WHEN N'ﾂ'
   WHEN N'ﾃ'
   WHEN N'ﾄ'
      CONTINUE
   WHEN OTHER
      DISPLAY 'ERROR'
END-EVALUATE

The N tells the compiler that this is a national literal and, as such, multi-byte.

The field you pass to the EVALUATE statement will also need to be defined as PIC N rather than PIC X. A PIC X field will not recognise double-byte characters.

answered Mar 08 '17 at 11:32

SaggingRufus


A single-byte character is 1 byte and a multi-byte character is 2 bytes, right? There are two ways to type a character in Japanese (half-width and full-width), and the characters that I checked were 1-byte characters. 1-byte: ｱｲｳｴｵｶｷｸｹｺｻｼｽｾｿﾀﾁﾂﾃﾄﾅﾆﾇﾈﾉﾊﾋﾌﾍﾎﾏﾐﾑﾒﾓﾔﾕﾖﾗﾘﾙﾚﾛﾜｦﾝ ......... 2-byte: アイウエオカキクケコサシスセソタチツテトナニヌネノハヒフヘホマミムメモヤユヨラリルレロワヲン – tieuquynd Mar 09 '17 at 03:58
Multi-byte just means more than one byte. In my experience such characters are typically double-byte, but nothing stops them from being more than 2 bytes. Because these characters are outside the EBCDIC code page, I believe you would still need to use a national character. How do you know those are single-byte characters? In UTF-16, for example, the é character is double-byte. Give the PIC N a try; I am pretty certain it will work. – SaggingRufus Mar 09 '17 at 11:19
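
The "1-byte versus 2-byte" question in these comments can be checked directly, because the answer depends on the encoding. The sketch below is Python, not COBOL, and assumes nothing about the actual compiler; the helper name bytes_per_char and the choice of four sample characters and four encodings are illustrative only.

```python
ENCODINGS = ("shift_jis", "euc_jp", "utf-8", "utf-16-be")


def bytes_per_char(text: str, encoding: str) -> list:
    """Return the encoded length in bytes of each character in text."""
    return [len(ch.encode(encoding)) for ch in text]


half_width = "ｱﾁﾂﾝ"   # taken from the question's half-width lists
full_width = "アチツン"  # the corresponding full-width katakana

for encoding in ENCODINGS:
    print(encoding,
          bytes_per_char(half_width, encoding),
          bytes_per_char(full_width, encoding))
```

Of these four encodings, only Shift_JIS stores half-width katakana in one byte each; EUC-JP and UTF-16 use two bytes and UTF-8 uses three. So a character that is "1-byte" in one environment may well reach the compiler as several bytes if the source file is saved in a different encoding.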
