
I need to generate text data files, both with UTF-8 byte-order-mark and without it. How do I do that?

So far the file has been generated like this:

File(fileName).writeText(source, Charsets.UTF_8)

But this does not provide a way to add the BOM on demand.

Note 1:

The question "How to add a UTF-8 BOM in java" uses BufferedWriter and PrintStream.print(), but that would mean changing the file generation to a more Java-oriented style (this is my last resort).

Note 2:

Another question, "Java: UTF-8 and BOM" from 2012, points to a Java bug where the BOM is not handled. The comments there suggest never using a BOM, but that is not an option in my case because the files are sent to different services, some of which require it and others do not. Does anybody know of more recent news about this, and whether it applies to Kotlin?

Alejandro Montilla

1 Answer


The BOM is a single Unicode character, U+FEFF. You can easily add it yourself, if it's required.

File(fileName).writeText("\uFEFF" + source, Charsets.UTF_8)
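To make the BOM optional per target service, the same idea can be wrapped in a small helper. This is only a sketch: the function name `writeUtf8` and the `withBom` flag are hypothetical and not part of the Kotlin standard library.

```kotlin
import java.io.File

// Hypothetical helper (not a stdlib function): writes UTF-8 text and,
// when withBom is true, prefixes it with the BOM character U+FEFF.
fun writeUtf8(file: File, source: String, withBom: Boolean) {
    val prefix = if (withBom) "\uFEFF" else ""
    // U+FEFF is encoded in UTF-8 as the three bytes EF BB BF.
    file.writeText(prefix + source, Charsets.UTF_8)
}
```

A call such as `writeUtf8(File(fileName), source, withBom = true)` would then serve the services that require the BOM, and `withBom = false` the others.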

The harder part is that the BOM is not stripped automatically when the file is read back in. This is why people recommend not adding a BOM when it's not needed.
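If such a file has to be read back in Kotlin, one possible way to drop a leading BOM is to remove the U+FEFF prefix after decoding. This is a minimal sketch; the function name `readUtf8StrippingBom` is hypothetical.

```kotlin
import java.io.File

// Hypothetical helper: reads a UTF-8 file and drops a leading BOM (U+FEFF)
// if one is present. Text without a BOM is returned unchanged.
fun readUtf8StrippingBom(file: File): String =
    file.readText(Charsets.UTF_8).removePrefix("\uFEFF")
```

Only a BOM at the very start of the text is removed; any U+FEFF characters elsewhere in the content are left alone.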

Joni
  • It is not just "people"; it is Unicode: do not put a BOM in UTF-8, although it is OK to keep a BOM in UTF-8 if the original file had one (to preserve that information in case one needs to convert back to the original charset). – Giacomo Catenazzi Jun 11 '20 at 15:40