I need to convert a text file from UTF-8 to CP1251, and I can't use any third-party software. Is there a routine written in COBOL for that? I'm using Micro Focus COBOL on Windows.

Pavel Matras
  • It's a simple read/write program. Look through the documentation on the Micro Focus website for how to convert Unicode to a code page. – Bill Woodger Sep 26 '16 at 16:36
  • "I can't use any third party software?" Yet you are proposing to write yet-another-tool which looks to me an awful lot like third party software. – Ira Baxter Sep 26 '16 at 19:39
  • Although it may count as third-party, open the file in a text editor, like TextPad or NotePad++ or Crimson Editor, or..., and save to a different encoding. If you need to tell management "it was written in COBOL", then `CALL "SYSTEM" USING "a-batchfile-invoking-a-scriptable-editor-with-this-filename"` – Brian Tiffin Sep 27 '16 at 03:47

2 Answers

Answer: there are lots of COBOL routines written for that...

I don't know of any free (= open source, with the freedom to actually use it) implementation, but you can easily write one yourself. Just go through the source and move each character to the target; if a character is not available in CP1251, use a '?' or whatever. The only real work here: you need to look up the 128 characters from x'80' and above...

Either check whether MF provides a specific extension for this or write it yourself. There is no "please code this for me" on SO, so you should show what you've already tried.

To give you an idea, have a look at the conversion of this JavaScript sample; it should be something like this (untested code):

       77  utf-8-field     PIC X(5000).
       77  new-char        PIC X.
       77  cp1251-field    PIC X(5000).
       77  utf-8-pos       PIC 9(04) COMP-5.
       77  cp1251-pos      PIC 9(04) COMP-5.
       77  utf-8-end       PIC 9(04) COMP-5.
       *> low byte nn of the decoded code point U+04nn
       77  code-low        PIC 9(04) COMP-5.

       MOVE FUNCTION LENGTH ( FUNCTION TRIM (utf-8-field TRAILING) )
         TO utf-8-end
       MOVE 1 TO cp1251-pos
       PERFORM VARYING utf-8-pos FROM 1 BY 1
               UNTIL   utf-8-pos > utf-8-end
          EVALUATE TRUE
             *> normal ASCII character
             WHEN utf-8-field (utf-8-pos:1) < x'80'
                MOVE utf-8-field (utf-8-pos:1) TO new-char
             *> UTF-8 in CP1251 range: lead byte x'D0' or x'D1'
             *> (U+0400 .. U+047F) followed by one continuation byte
             WHEN utf-8-field (utf-8-pos:1) = x'D0' OR x'D1'
                *> skip the first byte
                ADD 1 TO utf-8-pos
                IF utf-8-pos > utf-8-end
                   MOVE '?' TO new-char
                ELSE
                   *> assumes a valid continuation byte x'80' .. x'BF';
                   *> FUNCTION ORD is 1-based, hence 129 instead of 128
                   COMPUTE code-low =
                      FUNCTION ORD (utf-8-field (utf-8-pos:1)) - 129
                   IF utf-8-field (utf-8-pos - 1:1) = x'D1'
                      ADD 64 TO code-low
                   END-IF
                   MOVE FUNCTION CHAR (code-low + 1) TO new-char
                   EVALUATE TRUE
                      WHEN new-char  = x'51'
                         MOVE x'B8' TO new-char
                      WHEN new-char  > x'4F'
                         MOVE '?'   TO new-char
                      *> alternative: use alphabet conversion here
                      WHEN new-char  = x'01'
                         MOVE x'A8' TO new-char
                      WHEN OTHER
                         INSPECT new-char CONVERTING x'0203 ...
                                          TO         x'B2B3 ...
                   END-EVALUATE
                END-IF
             *> UTF-8 with no CP1251 char
             *> Todo: check for other multibyte headers and add the correct
             *>       number of characters to utf-8-pos
             *> WHEN ...
             WHEN OTHER
                MOVE '?' TO new-char
          END-EVALUATE
          STRING new-char
                 DELIMITED BY SIZE
                 INTO cp1251-field
                 WITH POINTER cp1251-pos
          END-STRING
       END-PERFORM

You may want to define an ALPHABET for the CONVERTING x'0203 ... TO x'B2B3 ... part:

       SPECIAL-NAMES.
          ALPHABET UTF8-PART-2 IS x'01', x'02' THRU x'4F', x'51'.
          ALPHABET CP1251      IS x'A8', x'B2' THRU x'FF', x'B8'.

and in the inner EVALUATE, with new-char already holding the byte to map, use

           INSPECT new-char CONVERTING UTF8-PART-2 TO CP1251
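
The loop above converts a single field. To apply it to a whole file, as the "simple read/write program" comment on the question suggests, a wrapper along these lines might work. This is an untested sketch: the file names, the 5000-byte record size, the eof-flag item and the convert-line paragraph are assumptions made for illustration, and how LINE SEQUENTIAL files treat non-ASCII bytes and trailing spaces should be checked against the Micro Focus documentation.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. utf8to1251.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      *>   hypothetical file names, adjust as needed
           SELECT utf8-file   ASSIGN TO "input-utf8.txt"
                              ORGANIZATION IS LINE SEQUENTIAL.
           SELECT cp1251-file ASSIGN TO "output-cp1251.txt"
                              ORGANIZATION IS LINE SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       FD  utf8-file.
       01  utf8-record        PIC X(5000).
       FD  cp1251-file.
       01  cp1251-record      PIC X(5000).
       WORKING-STORAGE SECTION.
       77  utf-8-field        PIC X(5000).
       77  cp1251-field       PIC X(5000).
       77  eof-flag           PIC X VALUE 'N'.
           88  end-of-input   VALUE 'Y'.
       PROCEDURE DIVISION.
       main-para.
           OPEN INPUT  utf8-file
                OUTPUT cp1251-file
           PERFORM UNTIL end-of-input
              READ utf8-file INTO utf-8-field
                 AT END
                    SET end-of-input TO TRUE
                 NOT AT END
      *>            one input line in, one converted line out
                    MOVE SPACES TO cp1251-field
                    PERFORM convert-line
                    WRITE cp1251-record FROM cp1251-field
              END-READ
           END-PERFORM
           CLOSE utf8-file cp1251-file
           GOBACK.
       convert-line.
      *>   placeholder: put the PERFORM VARYING conversion loop here;
      *>   as written, this paragraph only copies the line unchanged
           MOVE utf-8-field TO cp1251-field.
```

Because a converted line is never longer than its UTF-8 source, the output record can safely be the same size as the input record.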
Simon Sobisch

Have you looked at CBL_STRING_CONVERT?

Stephen Gennard