Elixir 1.3.0
Windows 10
Postgrex 0.11.2
Ecto 2.0.1
Postgres 9.4.4
I'm attempting to add records to a PostgreSQL database via Ecto. When I get to a string containing \x0087 it throws the following error:
** (Postgrex.Error) ERROR (character_not_in_repertoire): invalid byte sequence for encoding "UTF8": 0x87
I'm pretty sure the problem is the file itself, which as far as I can tell is encoded as Latin1. This is the code I use to open the file and read it in:
:ok = :io.setopts(:standard_io, encoding: :latin1)
File.open!(file)
|> IO.binstream(:line)
The file opens without errors, and several lines are processed correctly before it reaches a line that contains \x0087.
What I can't figure out is how to convert a line that was read as Latin1 into UTF-8. I found String.normalize, which seems like it might help with the conversion, but I know I'm grasping at straws.
I changed the encoding: parameter on the :io.setopts line to :utf8, but it doesn't seem to make a difference.
Is there some simple way to convert an ANSI/Latin1 encoded string to a UTF-8 encoded string?
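To make the question concrete, here is a minimal sketch of the kind of conversion I have in mind. It assumes that Erlang's :unicode.characters_to_binary/3 is a suitable tool and that the data really is Latin1. The module name Latin1Sketch, the helper to_utf8/1, and the sample bytes are made up for illustration and are not part of Ecto or Postgrex.

```elixir
defmodule Latin1Sketch do
  # Hypothetical helper: treat every byte of `binary` as one Latin1
  # code point and re-encode the result as a UTF-8 binary.
  def to_utf8(binary) when is_binary(binary) do
    :unicode.characters_to_binary(binary, :latin1, :utf8)
  end
end

# Made-up sample line: "caf", Latin1 e-acute (0xE9), a space, and byte 0x87.
raw = <<"caf", 0xE9, " ", 0x87>>
converted = Latin1Sketch.to_utf8(raw)

# The raw bytes are not valid UTF-8; the converted binary is.
IO.inspect(String.valid?(raw))
IO.inspect(String.valid?(converted))
```

If something like this is right, I would presumably map each line from IO.binstream through the helper before handing it to Ecto. One caveat I'm unsure about: if the file is actually Windows-1252 rather than true Latin1, byte 0x87 is the double dagger character there, and a Latin1 conversion would turn it into the control character U+0087 instead.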