My first idea is this:
In [1380]: x=sparse.coo_matrix([0,0,1,1,0,1,0,0,0,1])
In [1381]: ''.join([str(i) for i in x.A.ravel().tolist()])
Out[1381]: '0011010001'
Not necessarily better, but I think it illustrates some key issues. Is it useful to work with the sparse matrix or the dense one? How do you convert integers to strings? Is this a 1-row matrix?
I can improve the string conversion with astype:
In [1393]: ''.join(x.A.ravel().astype('U1'))
Out[1393]: '0011010001'
join is performing a list iteration on the array.
With bytestrings (PY3, or normal string in PY2), tostring is an alternative to join. This just returns the data buffer as a string:
In [1451]: x.A.astype('S1').tostring()
Out[1451]: b'0011010001'
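The dense route above can be wrapped up as a small function. This is only a sketch: the name dense_to_bitstring is mine, I use toarray() as the spelled-out form of .A, and tobytes() because that is the current NumPy name for tostring(). It assumes every value is a single digit, since 'S1' keeps only one character per element.

```python
import numpy as np
from scipy import sparse

def dense_to_bitstring(x):
    """Return the values of a 1-row sparse matrix as a string of digits."""
    dense = x.toarray().ravel()        # dense 1d copy of the matrix
    as_bytes = dense.astype('S1')      # one byte per element: b'0' or b'1'
    # tobytes() returns the raw data buffer; decode to get a regular str
    return as_bytes.tobytes().decode('ascii')

x = sparse.coo_matrix([[0, 0, 1, 1, 0, 1, 0, 0, 0, 1]])
print(dense_to_bitstring(x))
```

The decode step is only needed if you want a str rather than a bytestring.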
I can use astype on the sparse matrix itself, but there's a bug preventing me from making that dense:
In [1397]: x.astype('U1').A
...
ValueError: unsupported data types in input
===========================
A variation on your iteration: start with a string of '0' characters, make it a list, and set a '1' at each column index. sparse.nonzero just returns the .col values from the coo format matrix.
In [1403]: ll # ll = '0'*x.shape[1]
Out[1403]: '0000000000'
In [1404]: ll=list(ll)
In [1405]: ll
Out[1405]: ['0', '0', '0', '0', '0', '0', '0', '0', '0', '0']
In [1406]: for i in x.col:
      ...:     ll[i]='1'
      ...:
In [1407]: ll
Out[1407]: ['0', '0', '1', '1', '0', '1', '0', '0', '0', '1']
In [1408]: ''.join(ll)
Out[1408]: '0011010001'
Or doing the same thing with a string array:
In [1416]: ll=np.empty(x.shape[1], dtype='U1')
In [1417]: ll.fill('0')
In [1418]: ll
Out[1418]:
array(['0', '0', '0', '0', '0', '0', '0', '0', '0', '0'],
      dtype='<U1')
In [1419]: ll[x.col]='1'
In [1420]: ll
Out[1420]:
array(['0', '0', '1', '1', '0', '1', '0', '0', '0', '1'],
      dtype='<U1')
This avoids the loop(s) since I can assign multiple values at once.
For this small example, the list solutions might be just as fast or faster. Array versions have some array creation overhead, so they only pay off when the matrix is large.
Even for a coo matrix with (1,1210) shape, this iterative list version is noticeably faster:
def foo1(x):
    ll=list('0'*x.shape[1])   # ll=['0']*x.shape[1] is a little faster
    for i in x.col:
        ll[i]='1'
    return ''.join(ll)
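If you want to check the speed claim on your own machine, something like the following should do. It is a sketch, not a benchmark: list_version and array_version are names I made up for the two approaches above, the test matrix comes from sparse.random with an arbitrary density of 0.1, and the timings will vary with hardware and library versions.

```python
import timeit
import numpy as np
from scipy import sparse

def list_version(x):
    # iterate over the stored column indices, filling a Python list
    ll = ['0'] * x.shape[1]
    for i in x.col:
        ll[i] = '1'
    return ''.join(ll)

def array_version(x):
    # same idea, but assign all positions at once in a string array
    ll = np.full(x.shape[1], '0', dtype='U1')
    ll[x.col] = '1'
    return ''.join(ll)

# hypothetical test case with the (1,1210) shape mentioned above
x = sparse.random(1, 1210, density=0.1, format='coo', random_state=0)
assert list_version(x) == array_version(x)

for f in (list_version, array_version):
    t = timeit.timeit(lambda: f(x), number=200)
    print(f.__name__, t)
```

Both functions read x.col directly, so they expect a coo matrix.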
If the matrix is not coo, either convert it with x.tocoo() or use x.nonzero() (but look at its code).
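A format-agnostic version might look like this. Again a sketch: to_bitstring is my own name, and the data != 0 mask is my addition to skip any explicitly stored zeros, which .col alone would not do.

```python
import numpy as np
from scipy import sparse

def to_bitstring(x):
    """Build the '0'/'1' string for a 1-row sparse matrix of any format."""
    coo = x.tocoo()                      # gives access to .col and .data
    ll = ['0'] * coo.shape[1]
    # keep only the column indices whose stored value is really nonzero
    for i in coo.col[coo.data != 0]:
        ll[i] = '1'
    return ''.join(ll)

x = sparse.csr_matrix([[0, 0, 1, 1, 0, 1, 0, 0, 0, 1]])
print(to_bitstring(x))
```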
=========
I've ignored your None test. Why is that there? It could be dangerous:
In [1448]: x.nonzero()[1] != None
/usr/local/bin/ipython3:1: FutureWarning: comparison to `None` will result in an elementwise object comparison in the future.
#!/usr/bin/python3
Out[1448]: True
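If the None test is really needed, an identity check is the safer spelling, because it never goes through NumPy's comparison machinery. A minimal sketch, with has_columns as a made-up helper name:

```python
import numpy as np

def has_columns(cols):
    """True if cols is an index array holding at least one entry."""
    # 'is not None' is an identity test, so it always yields a plain bool,
    # unlike 'cols != None', which newer NumPy evaluates element by element
    return cols is not None and cols.size > 0

print(has_columns(np.array([2, 3, 5, 9])))
print(has_columns(np.array([], dtype=int)))
print(has_columns(None))
```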