Assume that we have this DataFrame in Python:
import pandas as pd
arr = pd.DataFrame(['aabbc','aabccca','aa'])
I want to split each row into columns, one character per column. The rows may differ in length. This is the output I expect (a 3×7 matrix in this case):
     1    2    3    4    5    6    7
1    a    a    b    b    c    NaN  NaN
2    a    a    b    c    c    c    a
3    a    a    NaN  NaN  NaN  NaN  NaN
My matrix has 20,000 rows, and I would prefer not to use for loops. The original data are protein sequences.
I have read [1], [2], [3], etc., but they did not help.
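To make the target layout concrete, here is a minimal sketch of one way the expected table could be produced. It assumes the strings live in column `0` of `arr`, as in the snippet above; the names `chars` and `out` are only illustrative. Note that `apply(list)` still iterates over the rows internally, so this only avoids an explicit for loop and is not necessarily the fastest option.

```python
import pandas as pd

arr = pd.DataFrame(['aabbc', 'aabccca', 'aa'])

# Convert each string into a list of its characters.
chars = arr[0].apply(list)

# Build a new DataFrame from the lists; rows shorter than the
# longest one are padded with missing values.
out = pd.DataFrame(chars.tolist(), index=arr.index)

# Shift row and column labels to start at 1, matching the expected output.
out.index = out.index + 1
out.columns = out.columns + 1

print(out)
```

The missing cells are filled with `None` in an object-dtype frame, which `out.isna()` treats as missing in the same way as `NaN`.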