6

I have an UTF string with \0 character and text field in a sqlite table.
When I tried to insert the string into table text field and then read it from the database I noticed that string value was truncated after \0 character.

Question: Is it possible to so save/restore such strings in sqlite without losing data after \0?

The code snippet:

 public static void IssueWith0Character()
    {
        const string sql = "DROP TABLE IF EXISTS SomeTable;" +
                           "CREATE TABLE SomeTable (SomeField TEXT not null);"
                           + "INSERT INTO SomeTable (SomeField) Values ( :value )";

        var csb = new SQLiteConnectionStringBuilder
                      {DataSource = "stringWithNull.db", Version = 3};

        // string with '0' character
        const string stringWithNull = "beforeNull\0afterNull";

        using (var c = new SQLiteConnection(csb.ConnectionString))
        {
            c.Open();

            using (var cmd = c.CreateCommand())
            {
                var p = new SQLiteParameter(":value", DbType.String) {Value = stringWithNull};
                cmd.CommandText = sql;
                cmd.Parameters.Add(p);
                cmd.ExecuteNonQuery();
            }

            using (var cmd = c.CreateCommand())
            {
                cmd.CommandText = "SELECT SomeField FROM SomeTable;";
                var restoredValue = (string) cmd.ExecuteScalar();
                Debug.Assert(stringWithNull == restoredValue);
            }
        }
    }    

UPDATE #1 It looks like problem is on reading stage. At least "afterNull" part of a string exists in the database file.

UPDATE #2 That was considered as System.Data.SQLite bug (<1.04.84). http://system.data.sqlite.org/index.html/tktview/3567020edf12d438cb7cf757b774ff3a04dc381e

  • 1
    I think that's what `BLOB` is for. – Joachim Isaksson Mar 05 '13 at 17:58
  • When the string gets passed to sqlite, it probably treats the '\0' as a string terminator. Could you instead use '\\0'? – Kyle Mar 05 '13 at 18:04
  • Joachim, thanks. I almost sure that using BLOB will resolve persistance problem. May be that is an only option for me. In my case most of time I have just human readable strings without \0s. Quite rarely I have human readable strings with \0. Аfter introducing BLOB it will be necessary to do extra ToBytes/FromBytes string convertions as well as it will be harder to introduce search functionality in my application. – Vladimir Datsyuk Mar 05 '13 at 21:06
  • Zenox, the question is why \0 is treated as a string terminator for saving UTF strings in sqlite. Sqlite uses UTF to store text. \0 is just first character from huge UTF table. I hope all other symbols can be saved without any issues. – Vladimir Datsyuk Mar 05 '13 at 21:20
  • There is sqlite3_bind_text function in sqlite API that is actually called by Sqlite data adapter. Here http://www.sqlite.org/c3ref/bind_blob.html it is mentioned that if paremeters are passed correctly then cite:[ If any NUL characters occur at byte offsets less than the value of the fourth parameter then the resulting string value will contain embedded NULs] – Vladimir Datsyuk Mar 05 '13 at 21:26
  • Does this answer your question? [SQLite strings with NUL](https://stackoverflow.com/questions/4049348/sqlite-strings-with-nul) – iammilind Feb 13 '20 at 06:19

1 Answers1

-2

In SQLite, \0 characters are considered invalid.

While it is possible to put such strings into the database (using the pointer+length form of various functions), many functions that operate on strings stop when encountering the \0. Therefore, the documentation says:

The result of expressions involving strings with embedded NULs is undefined.

If your really need to store data with null bytes, you should store it as a blob (DbType.Binary).

CL.
  • 173,858
  • 17
  • 217
  • 259
  • Thanks for your reply. Looks like I am supposed to use BLOB/DbType.Binary in that situation. However it seems strange for me that \0 is not valid character in UTF. I think it is just first character in any UTF table. It is also 100% valid in .NET strings. So at least for me it is looks strange that default string encoding in sqlite is UTF8 but it impossible to use one character from UTF8 table (even if that character is used as special one in C strings). – Vladimir Datsyuk Mar 05 '13 at 20:36
  • \0 characters are valid in UTF-encoded strings, and SQLite does allow you to store them. U+0000 is \0, which is a valid UTF-8 character http://en.wikipedia.org/wiki/UTF-8 . See https://www.sqlite.org/c3ref/bind_blob.html for more details on the second part. – ryantm Oct 07 '14 at 17:51
  • @ryantm "The result of expressions involving strings with embedded NULs is undefined." – CL. Oct 07 '14 at 19:23
  • I'm not totally sure what that means, but my guess is that it is talking about query expressions https://www.sqlite.org/lang_expr.html . When I put \0 into those, it treats it like the end of the query is at that character (usually a syntax error when in the middle of a string). When I use bind (either text or blob), it accepts \0, which is consistent with the sentence previous to the one you quoted: "If any NUL characters occur at byte offsets less than the value of the fourth parameter then the resulting string value will contain embedded NULs." – ryantm Oct 08 '14 at 16:29