
I'm attempting to store IP packet payloads in a PostgreSQL database with Django.

Currently, I'm storing the payload as a CharField.

I'm getting this error:

django.db.utils.DatabaseError: invalid byte sequence for encoding "UTF8": 0xedbc93
HINT:  This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".

Is there any way to sanely store this data? I'm able to do str(packet.payload) with no errors, but when Django tries to save the object it throws the encoding error. A bytestring seems like the obvious solution, but it doesn't look like Django supports that.

rouge8

1 Answer


If you want to store arbitrary bytestrings, you should declare them as such. Many (most?) sequences of bytes are not valid UTF-8, so it isn't a good way to store them. A CharField is for storing text, and you don't have text.
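As a quick illustration of why the save fails, the three bytes from the error message can be checked in plain Python 3. This is only a sketch, independent of Django; the helper name `is_valid_utf8` is hypothetical.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8, False otherwise."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# The byte sequence reported in the PostgreSQL error message.
payload = bytes.fromhex("edbc93")

print(is_valid_utf8(payload))            # False
print(is_valid_utf8(b"GET / HTTP/1.1"))  # True
```

A payload that happens to be plain ASCII passes, which is why the error only shows up for some packets.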

The answers to this question will likely be helpful: Django Blob Model Field
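If the column has to stay a text type, one possible workaround is to encode the bytes as ASCII-safe text before saving and decode them on read. Below is a minimal sketch using Python's standard `base64` module; the helper names `encode_payload` and `decode_payload` are hypothetical, and wiring them into a model is left out. (Django 1.6 and later also provide `models.BinaryField`, which maps to `bytea` on PostgreSQL and avoids the encoding step.)

```python
import base64


def encode_payload(raw: bytes) -> str:
    """Turn arbitrary bytes into ASCII text that is safe for a text column."""
    return base64.b64encode(raw).decode("ascii")


def decode_payload(stored: str) -> bytes:
    """Recover the original bytes from the stored base64 text."""
    return base64.b64decode(stored)


# Bytes that are not valid UTF-8, including the sequence from the error.
raw = bytes.fromhex("edbc93") + b"\x00\xff"

stored = encode_payload(raw)
restored = decode_payload(stored)
```

The trade-off is size: base64 text is roughly a third larger than the raw bytes, so a CharField's `max_length` would need to account for that.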

Ned Batchelder
  • What he said. An IP packet payload is a binary blob. It's neither a string nor is it Unicode. Even if the protocol is 100% Unicode text, it is possible for a packet payload to be invalid Unicode. – Michael Dillon Feb 12 '12 at 22:51