How can one read/write pandas DataFrames (NumPy arrays) of strings in Cython?
It works just fine when I work with integers or floats:
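# Cython file numpy_.pyx
cimport numpy as np
from cython cimport boundscheck, wraparound

@boundscheck(False)
@wraparound(False)
cpdef fill(np.int64_t[:, ::1] arr):
    # Overwrite the top-left 2x2 block in place.
    arr[0, 0] = 10
    arr[0, 1] = 11
    arr[1, 0] = 13
    arr[1, 1] = 14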
# Python code
import numpy as np
from numpy_ import fill
a = np.array([[0,1,2],[3,4,5]], dtype=np.int64)
print(a)
fill(a)
print(a)
gives
>>> a = np.array([[0,1,2],[3,4,5]], dtype=np.int64)
>>> print(a)
[[0 1 2]
 [3 4 5]]
>>> fill(a)
>>> print(a)
[[10 11  2]
 [13 14  5]]
Also, the following code
# Python code
import numpy as np, pandas as pd
from numpy_ import fill
a = np.array([[0,1,2],[3,4,5]], dtype=np.int64)
df = pd.DataFrame(a)
print(df)
fill(df.values)
print(df)
gives
>>> a = np.array([[0,1,2],[3,4,5]], dtype=np.int64)
>>> df = pd.DataFrame(a)
>>> print(df)
   0  1  2
0  0  1  2
1  3  4  5
>>> fill(df.values)
>>> print(df)
    0   1  2
0  10  11  2
1  13  14  5
However, I am having a hard time figuring out how to do the same thing when the input is an array of strings. For example, how can I read or modify a NumPy array or a pandas DataFrame such as:
a2 = np.array([['000','111','222'],['333','444','555']], dtype='U3')
df2 = pd.DataFrame(a2)
and, let us say, the goal is to change through Cython
'000' -> 'AAA'; '111' -> 'BBB'; '222' -> 'CCC'; '333' -> 'DDD'
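To make the goal concrete, here is a rough pure-Python sketch of the behaviour I would like to reproduce in Cython. The names `MAPPING` and `fill_str` are made up for illustration. The last part rests on my understanding that a native-byte-order 'U3' element is stored as three 4-byte code points, so the same buffer can apparently be viewed as unsigned 32-bit integers; whether that is the right way to hand the data to a typed memoryview is exactly what I am unsure about.

```python
import numpy as np

# Hypothetical mapping, taken from the example replacements above.
MAPPING = {'000': 'AAA', '111': 'BBB', '222': 'CCC', '333': 'DDD'}

def fill_str(arr):
    """Replace mapped strings in a 2-D 'U3' array in place (pure-Python reference)."""
    for i in range(arr.shape[0]):
        for j in range(arr.shape[1]):
            key = str(arr[i, j])
            if key in MAPPING:
                arr[i, j] = MAPPING[key]

a2 = np.array([['000', '111', '222'], ['333', '444', '555']], dtype='U3')
fill_str(a2)
print(a2)

# View the same memory as integer code points, one per character.
# Assumes a C-contiguous array in native byte order; no copy is made.
codes = a2.view(np.uint32).reshape(a2.shape + (3,))
print(codes[0, 0])                    # code points of the first element
print(np.shares_memory(codes, a2))    # checks that the buffer is shared
```

Elements without an entry in the mapping ('444' and '555' here) are left untouched.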
I did read the following NumPy documentation page and the following Cython documentation page, but I still cannot figure out what to do.
Thank you very much for your help!