I'm having trouble printing a Unicode string using the wprintf function in FASM (Flat Assembler).

I've tried the following code, but it produces random output (हिन्ि):


format PE64 console
entry start

include './include/win64w.inc'
include './include/macro/proc64.inc'
include './include/encoding/utf8.inc'

;======================================
section '.data' data readable writeable
;======================================
;unicode for हिन्दी
wunicode_string dw  0xe0, 0xa4, 0xb9, 0xe0, 0xa4, 0xbf, 0xe0, 0xa4, 0xa8, 0xe0, 0xa5, 0x8d, 0xe0, 0xa4, 0xbf



;=======================================
section '.code' code readable executable
;=======================================

start:
    
    mov rax, 0
    ccall [wprintf], "%ls", wunicode_string
   

    ccall   [getchar]                   ; I added this line to exit the application AFTER the user pressed any key.
    stdcall [ExitProcess],0             ; Exit the application

;====================================
section '.idata' import data readable
;====================================

library kernel,'kernel32.dll',        msvcrt,'msvcrt.dll'

import  kernel,        ExitProcess,'ExitProcess'

import  msvcrt,        printf,'printf', wprintf, 'wprintf',       getchar,'_fgetchar'

  • What if you change the string to `wunicode_string dw 0x41, 0x42, 0x43, 0x44` (aka ABCD)? – David Wohlferd Jun 03 '23 at 06:17
  • @DavidWohlferd then it works – Shikhar Mishra Jun 03 '23 at 06:30
  • So it's probably printing exactly the characters you are telling it to print. What do you expect to see? – David Wohlferd Jun 03 '23 at 06:34
  • I want to see हिन्दी whose unicode I got from [this](https://www.coderstool.com/unicode-text-converter) website – Shikhar Mishra Jun 03 '23 at 06:38
  • Ahh. Sorry, it's right there in your post. Unicode characters are 16 bits long, so using `dw` is correct. But then it looks like you are only specifying 8 bits: `0xe0`. Should that be `0xe0a4`? Or maybe `0xa4e0`? – David Wohlferd Jun 03 '23 at 06:46
  • I don't think I have the right code pages installed to display it here, but perhaps `wunicode_string du "हिन्दी"`? Also, [this](https://stackoverflow.com/q/388490/2189500) seems to suggest that while the Windows *console* supports everything unicode, the c *runtime* via stdout? Not so much. – David Wohlferd Jun 03 '23 at 07:08
  • `wunicode_string du "हिन्दी"` shows blank output. Interestingly, all Unicode strings containing only English characters, whether UTF-8 or UTF-16, are shown correctly. The same thing in C code works flawlessly on the same console. Should I try WinAPI? – Shikhar Mishra Jun 03 '23 at 07:24
  • I've tried using WriteConsoleW, but it didn't help. But my C code doesn't show the string in a console window either so maybe it will work for you? – David Wohlferd Jun 03 '23 at 08:05
  • In C it works with printf using the code below: `#include ` `#include ` `char *s = "the हिन्दी string";` `int main(){` `// set console code page to utf-8` `SetConsoleOutputCP(65001);` `printf("%s\n",s);` `return 0;` `}` In FASM I got it to work using MessageBoxW, but still no luck with WriteConsoleW. Is there any way to call `SetConsoleOutputCP(65001);` using FASM? – Shikhar Mishra Jun 03 '23 at 09:09
  • I got the answer (I don't have enough reputation to answer my own question, somebody answer it I will mark it as answer) So you have to import SetConsoleOutputCP from kernel32 lib and then `ccall [SetConsoleOutputCP], 65001` and it will work – Shikhar Mishra Jun 03 '23 at 09:25
  • I think you have enough reputation to self answer now. – Michael Petch Jun 03 '23 at 16:31

1 Answer

The output is correct once the console's output code page is set to UTF-8 (65001). This can be done in FASM (the same approach should work in NASM too) as follows:

format PE64 console
entry start

include './include/win64a.inc'
include './include/macro/proc64.inc'


;=======================================
section '.code' code readable executable
;=======================================

start:

    ccall   [SetConsoleOutputCP], 65001
    ccall   [printf], "%s", "हिन्दी"


    ccall   [getchar]                   ; I added this line to exit the application AFTER the user pressed any key.
    stdcall [ExitProcess], 0            ; Exit the application

;====================================
section '.idata' import data readable
;====================================

library kernel,'kernel32.dll',\  
        msvcrt,'msvcrt.dll'

import  kernel,\  
        ExitProcess,'ExitProcess',\  
        SetConsoleOutputCP, 'SetConsoleOutputCP'

import  msvcrt,\  
        printf,'printf',\  
        getchar,'_fgetchar'

I've used printf but it will work for wprintf too.
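
The string literal in the code above is copied into the executable byte for byte, so it only comes out as UTF-8 if the source file itself is saved as UTF-8. A minimal sketch of one way to avoid depending on the editor's encoding, assuming the same include paths and imports as above, is to spell out the UTF-8 bytes with `db`. The labels `utf8_hindi` and `fmt` are names made up for this illustration, and the `sub rsp, 8` alignment line is an addition that is not in the code above:

```asm
format PE64 console
entry start

include './include/win64a.inc'
include './include/macro/proc64.inc'

;======================================
section '.data' data readable writeable
;======================================

; UTF-8 bytes for हिन्दी, one code point per group, zero-terminated:
; U+0939 U+093F U+0928 U+094D U+0926 U+0940
utf8_hindi db 0xE0,0xA4,0xB9, 0xE0,0xA4,0xBF, 0xE0,0xA4,0xA8
           db 0xE0,0xA5,0x8D, 0xE0,0xA4,0xA6, 0xE0,0xA5,0x80, 0
fmt        db '%s', 0

;=======================================
section '.code' code readable executable
;=======================================

start:
    sub     rsp, 8                      ; assumption: align the stack to 16 bytes before any call

    ccall   [SetConsoleOutputCP], 65001 ; switch console output to UTF-8
    ccall   [printf], fmt, utf8_hindi   ; print the raw UTF-8 bytes as a narrow string

    ccall   [getchar]                   ; wait for a key before exiting
    stdcall [ExitProcess], 0

;====================================
section '.idata' import data readable
;====================================

library kernel,'kernel32.dll',\
        msvcrt,'msvcrt.dll'

import  kernel,\
        ExitProcess,'ExitProcess',\
        SetConsoleOutputCP,'SetConsoleOutputCP'

import  msvcrt,\
        printf,'printf',\
        getchar,'_fgetchar'
```

Note that the console font also needs Devanagari glyphs; without them the bytes are still written as UTF-8, but they may render as boxes or blanks.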

Sep Roland