import numpy as np
import pandas as pd
from scipy.sparse import csr_matrix
csr = csr_matrix(np.array(
    [[0, 0, 4],
     [1, 0, 0],
     [2, 0, 0]]))
# Return a Coordinate (coo) representation of the csr matrix.
coo = csr.tocoo(copy=False)
# Access `row`, `col` and `data` properties of coo matrix.
df = pd.DataFrame({'index': coo.row, 'col': coo.col, 'data': coo.data})[['index', 'col', 'data']]
>>> df.head()
   index  col  data
0      0    2     4
1      1    0     1
2      2    0     2
I tried to convert a SciPy csr_matrix to a DataFrame whose columns hold the row index, column index, and value of each matrix entry.
The only issue is that the approach above does not produce rows for the entries whose value is 0. Here is what I'd like the output to look like:
>>> df
   index  col  data
0      0    0     0
1      0    1     0
2      0    2     4
3      1    0     1
4      1    1     0
5      1    2     0
6      2    0     2
7      2    1     0
8      2    2     0
You'll see that the code snippet above is taken from this answer in this thread.
My question: is there a way to convert the matrix to a DataFrame that also includes the elements whose value is 0?
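
For reference, here is a minimal sketch of one possible direction. It assumes the matrix is small enough to densify with `toarray()`, which gives up the memory savings of the sparse format. The name `df_full` is only illustrative.

```python
import numpy as np
import pandas as pd
from scipy.sparse import csr_matrix

csr = csr_matrix(np.array(
    [[0, 0, 4],
     [1, 0, 0],
     [2, 0, 0]]))

# Assumption: the matrix fits in memory as a dense array.
dense = csr.toarray()
n_rows, n_cols = dense.shape

# Enumerate every (row, col) pair in row-major order, which is the
# same order in which dense.ravel() flattens the values.
df_full = pd.DataFrame({
    'index': np.repeat(np.arange(n_rows), n_cols),
    'col': np.tile(np.arange(n_cols), n_rows),
    'data': dense.ravel(),
})
```

For this 3x3 example, `df_full` has one row per matrix entry, zeros included. For a large sparse matrix this would probably not be practical, since the number of rows equals the total number of entries.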