I am writing a function that pulls the top x values from a sparse vector (or fewer, if the vector holds fewer than x values). I would like to include an "in-place" option like many functions have: if the option is True, the top values are removed from the input vector; if it is False, the input is left unchanged.
My issue is that my current function overwrites the input vector even when it should leave it as is, and I am not sure why. I expected to solve this with an if statement that copies the input using copy.copy(), but that raises an error (ValueError: row index exceeds matrix dimensions) which does not make sense to me.
Code:
from scipy.sparse import csr_matrix
import copy

# Build a 1 x 20 sparse test vector
max_loc = 20
data = [1, 3, 3, 2, 5]
rows = [0] * len(data)
indices = [4, 2, 8, 12, 7]
sparse_test = csr_matrix((data, (rows, indices)), shape=(1, max_loc))
print(sparse_test)

def top_x_in_sparse(in_vect, top_x, inplace=False):
    if inplace == True:
        rvect = in_vect              # operate on the input itself
    else:
        rvect = copy.copy(in_vect)   # intended to leave the input untouched
    newmax = top_x
    count = 0
    out_list = []
    while newmax > 0:
        newmax = 1
        if count < top_x:
            # record the current maximum, then zero it out and drop it
            out_list += [csr_matrix.max(rvect)]
            remove = csr_matrix.argmax(rvect)
            rvect[0, remove] = 0
            rvect.eliminate_zeros()
            newmax = csr_matrix.max(rvect)
            count += 1
        else:
            newmax = 0
    return out_list

a = top_x_in_sparse(sparse_test, 3)
print(a)
print(sparse_test)   # expected to be unchanged, but it is not
My question has two parts:
- how do I prevent this function from overwriting the vector?
- how do I add the inplace option?
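
For reference, here is a minimal sketch of one way both parts might be handled. It rests on two assumptions. First, `copy.copy()` only makes a shallow copy, so the new matrix object likely still shares its underlying arrays with the original; the matrix's own `.copy()` method duplicates them. Second, all stored values are positive, as in the test data, so the loop can stop once the maximum is no longer greater than zero. Unlike the code above, it calls `eliminate_zeros()` once at the end rather than on every pass.

```python
from scipy.sparse import csr_matrix

def top_x_in_sparse(in_vect, top_x, inplace=False):
    # .copy() duplicates the data/indices/indptr arrays, so edits to
    # rvect cannot reach in_vect unless inplace=True was requested.
    rvect = in_vect if inplace else in_vect.copy()
    out_list = []
    # Assumes positive stored values: stop when nothing above zero remains.
    while len(out_list) < top_x and rvect.max() > 0:
        remove = rvect.argmax()          # column of the current maximum (1-row matrix)
        out_list.append(rvect[0, remove])
        rvect[0, remove] = 0             # overwrite the stored entry with an explicit zero
    rvect.eliminate_zeros()              # drop the explicit zeros in one pass
    return out_list

data = [1, 3, 3, 2, 5]
indices = [4, 2, 8, 12, 7]
sparse_test = csr_matrix((data, ([0] * len(data), indices)), shape=(1, 20))

kept = top_x_in_sparse(sparse_test, 3)                    # input left untouched
nnz_after_keep = sparse_test.nnz
removed = top_x_in_sparse(sparse_test, 3, inplace=True)   # input modified
nnz_after_remove = sparse_test.nnz
```

With `inplace=False` the function works on an independent copy, so `sparse_test` keeps all five stored values; with `inplace=True` the three largest are removed from it.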