"Strings" is not one of the data types that WritableTTree supports. See the blue box under https://uproot.readthedocs.io/en/latest/basic.html#writing-ttrees-to-a-file for a full list.
However, it's possible to write some string-like data. Awkward Arrays of strings are just lists of uint8 type with special metadata (the __array__: "string" parameter) indicating that they should be interpreted as strings. There are actually two types, "string" and "bytestring"; we assume that the former is UTF-8 encoded and the latter is not.
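To make the "lists of uint8" picture concrete, here is a minimal sketch in plain Python (no Awkward Array involved) of what the UTF-8 byte lists look like. The helper name to_byte_lists is made up for this illustration. Note that the list length is the number of bytes, not the number of characters, which matters for any non-ASCII text:

```python
def to_byte_lists(strings):
    """Hypothetical helper: encode each string as a list of UTF-8 byte values."""
    return [list(s.encode("utf-8")) for s in strings]

# "caf\u00e9" is "café" with a single composed é character (2 bytes in UTF-8)
byte_lists = to_byte_lists(["one", "five", "caf\u00e9"])

print(byte_lists[0])  # [111, 110, 101]
print(byte_lists[1])  # [102, 105, 118, 101]

# 4 characters, but 5 bytes: the per-entry length counts bytes
print(len("caf\u00e9"), len(byte_lists[2]))  # 4 5
```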
These data can be written to ROOT files by removing the parameters from the array, so that it looks like a plain array of integers:
>>> import awkward as ak
>>> array = ak.Array(["one", "two", "three", "four", "five"])
>>> ak.without_parameters(array)
<Array [[111, 110, 101], ..., [102, 105, 118, 101]] type='5 * var * uint8'>
Here's a way to write these data into a ROOT file:
>>> import uproot
>>> file = uproot.recreate("/tmp/some.root")
>>> file["tree"] = {"branch": ak.without_parameters(array)}
>>> file["tree"].show()
name                 | typename                 | interpretation
---------------------+--------------------------+-------------------------------
nbranch              | int32_t                  | AsDtype('>i4')
branch               | uint8_t[]                | AsJagged(AsDtype('uint8'))
When you read them back, the uint8_t* array could be cast as a char* array, but watch out! The strings are not null-terminated (they do not end with a \x00 byte). Many string-interpreting functions in C and C++ won't be expecting that. There are some functions, like strncpy and std::string's two-argument constructor, that can be given string length information so that they don't look for a null terminator. The string length information is in the counter branch, nbranch in the above.
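The same length-based idea can be sketched in plain Python. This is only an illustration of the logic, not Uproot or ROOT API: the flat byte buffer and the counts list below are hypothetical stand-ins for the contents of "branch" and "nbranch", and decode_rows is a made-up helper name. The point is that each string's boundaries come from its count, never from a terminator byte:

```python
def decode_rows(flat_bytes, counts, encoding="utf-8"):
    """Hypothetical helper: split a flat byte buffer into strings
    using per-entry lengths instead of null terminators."""
    strings = []
    start = 0
    for n in counts:
        # Slice exactly n bytes for this entry, then decode them
        strings.append(bytes(flat_bytes[start:start + n]).decode(encoding))
        start += n
    return strings

# Stand-in data: the bytes of "one", "two", "five" back to back, with no \x00 anywhere
flat = bytes([111, 110, 101, 116, 119, 111, 102, 105, 118, 101])
counts = [3, 3, 4]  # plays the role of the counter branch

print(decode_rows(flat, counts))  # ['one', 'two', 'five']
```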
I recognize that that's unpleasant. I just opened a feature request on Uproot for writing string data in a natural way, using ROOT's TLeafC, rather than this hack.