Is it possible to somehow load an array with a text field of unknown field length?
I figured out how to pass dtype to get strings into it. However, without specifying a length I just get U0, a type which seems unable to hold any data. E.g.:
>>> import io
>>> import numpy
>>> data = io.StringIO("test data lololol\ntest2 d4t4 ololol")
>>> ar = numpy.loadtxt(data, dtype=[("1",str), ("2",'S'), ("3",'S')])
>>> ar
array([('', b'', b''), ('', b'', b'')],
dtype=[('1', '<U0'), ('2', '|S0'), ('3', '|S0')])
When I switch to a dtype with an explicit size, the input does get loaded:
>>> data.seek(0)
0
>>> numpy.loadtxt(data, dtype=[("1",(str,30)), ("2",(str,30)), ("3",('S',30))])
array([("b'test'", "b'data'", b'lololol'),
("b'test2'", "b'd4t4'", b'ololol')],
dtype=[('1', '<U30'), ('2', '<U30'), ('3', '|S30')])
I'd probably be fine with either S or U. The field in my case is supposed to hold a set of textual flags, something like Linux environment variables. Preallocating a large amount of space just in case therefore seems like a big waste, especially when the number of rows goes into the millions.
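To put rough numbers on that concern, here is a minimal sketch. The row count of one million and the field width of 30 are assumptions picked only for illustration, not values from my real data:

```python
import numpy as np

# Hypothetical row count, chosen only for illustration.
n_rows = 1_000_000

# Fixed-width string dtypes reserve the full width for every row,
# no matter how short the stored text actually is.
u30 = np.dtype("U30")  # unicode: 4 bytes per character
s30 = np.dtype("S30")  # bytes: 1 byte per character

bytes_per_row_u = u30.itemsize
bytes_per_row_s = s30.itemsize

# Total memory reserved for a single such field across all rows.
total_u_mb = n_rows * bytes_per_row_u / 1e6
total_s_mb = n_rows * bytes_per_row_s / 1e6

print(f"U30: {bytes_per_row_u} bytes/row, {total_u_mb:.0f} MB per field")
print(f"S30: {bytes_per_row_s} bytes/row, {total_s_mb:.0f} MB per field")
```

So even a modest width gets multiplied by every row, and a width large enough to be safe for the longest possible flag string would cost proportionally more.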
I do understand, or at least have some idea, where such a design comes from: constructing a struct-like object that holds a whole row in one contiguous memory block. However, I thought there might be a way to make it keep something like a pointer in the case of strings.
Is it possible?