
Assume that we have the following pandas dataframe:

test_df = pd.DataFrame({'start': [1, 2, 3, 4], 'end': [2, 3, 4, 5], 'signal': [1, 2, 3, 4]}, columns=['start', 'end', 'signal'])

Can we update a numpy array in a vectorized way?

nparray = np.zeros(4)

For example, by using the compute function below?

def compute(nparray,start,end,signal):
    nparray[start:end] += signal

Right now, it gives the following error:

    nparray[start:end] += signal
TypeError: slice indices must be integers or None or have an __index__ method
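
The traceback says that start or end was not a plain integer when the slice was built. The call site is not shown in the question, so the cause is an assumption; passing whole DataFrame columns as start and end is one possible way to hit it. As a non-vectorized baseline, a minimal sketch that calls compute once per row with integer bounds might look like this:

```python
import numpy as np
import pandas as pd

test_df = pd.DataFrame(
    {'start': [1, 2, 3, 4], 'end': [2, 3, 4, 5], 'signal': [1, 2, 3, 4]},
    columns=['start', 'end', 'signal'],
)

def compute(nparray, start, end, signal):
    # start and end must be single integers for the slice to be valid
    nparray[start:end] += signal

nparray = np.zeros(4)

# Assumption: one call per row, casting the bounds to int explicitly
for start, end, signal in zip(test_df['start'], test_df['end'], test_df['signal']):
    compute(nparray, int(start), int(end), signal)

print(nparray)
```

Note that the last row targets index 4, which lies outside an array of length 4, so that slice is empty and the row has no effect.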
burcak

1 Answer


First, create a range for each row and convert it to a list. The problem then becomes an unnesting problem:

# df is the question's test_df; unnesting is a helper that is not defined in this answer
df['key'] = [list(range(x, y)) for x, y in zip(df.start, df.end)]
unnesting(df, ['key']).groupby('key').signal.sum()
key
1    1
2    2
3    3
4    4
Name: signal, dtype: int64

unnesting(df, ['key']).groupby('key').signal.sum().values
array([1, 2, 3, 4], dtype=int64)
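
The answer calls unnesting without defining it. The sketch below follows the definition quoted in the comments on this answer, with one change that is my assumption about what recent pandas releases need: `df.drop(explode,1)` is written as `df.drop(columns=explode)`. The `df` here is rebuilt from the question's `test_df` data so the block runs on its own.

```python
import numpy as np
import pandas as pd

def unnesting(df, explode):
    # Repeat each row's index label once per element of its list
    idx = df.index.repeat(df[explode[0]].str.len())
    # Flatten every list column into one long column
    df1 = pd.concat(
        [pd.DataFrame({x: np.concatenate(df[x].values)}) for x in explode],
        axis=1,
    )
    df1.index = idx
    # Bring back the non-list columns by joining on the repeated index
    return df1.join(df.drop(columns=explode), how='left')

df = pd.DataFrame({'start': [1, 2, 3, 4], 'end': [2, 3, 4, 5], 'signal': [1, 2, 3, 4]})
df['key'] = [list(range(x, y)) for x, y in zip(df.start, df.end)]

# Sum the signal of every interval that covers each position
result = unnesting(df, ['key']).groupby('key').signal.sum()
print(result)
```

Each row is expanded into one row per covered position, so memory grows with the total length of all intervals rather than with the number of rows.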
BENY
  • Do you mean this function? How does it work? `def unnesting(df, explode): idx=df.index.repeat(df[explode[0]].str.len()) df1=pd.concat([pd.DataFrame({x:np.concatenate(df[x].values)} )for x in explode],axis=1) df1.index=idx return df1.join(df.drop(explode,1),how='left')` – burcak Feb 02 '19 at 03:40
  • @burcak yes : -) that is the function – BENY Feb 02 '19 at 04:09
  • Assume that `test_df = pd.DataFrame({'start':[5,10], 'end':[7,20] ,'signal':[1,2]},columns=['start','end','signal'])`. Can we get a numpy array of `[1 1 0 0 0 2 2 2 2 2 2 2 2 2 2]` instead of `[1 1 2 2 2 2 2 2 2 2 2 2]`? – burcak Feb 02 '19 at 07:01
  • Also, I have one more question: is this unnesting function efficient for very big dataframes? `def unnesting(df, explode): idx=df.index.repeat(df[explode[0]].str.len()) df1=pd.concat([pd.DataFrame({x:np.concatenate(df[x].values)} )for x in explode],axis=1) df1.index=idx return df1.join(df.drop(explode,1),how='left')` – burcak Feb 02 '19 at 07:04
  • @burcak I think it can handle a big dataframe, but I am not sure about your additional case. – BENY Feb 02 '19 at 17:48
  • Assume that we have the following dataframe: `test_df = pd.DataFrame({'start':[5,7,15], 'end':[10,12,20] ,'signal':[1,2,3]},columns=['start','end','signal'])`. I want to come up with a numpy array of `[0 0 0 0 0 1 1 3 3 3 2 2 0 0 0 3 3 3 3 3]`. Is that possible? – burcak Feb 02 '19 at 18:07
  • @burcak `unnesting(df,['key']).groupby('key').signal.sum().reindex(list(range(max(df.end)))).fillna(0)` – BENY Feb 02 '19 at 21:31
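
For the dense array asked about in the last example, a NumPy-only alternative is a difference array followed by a cumulative sum. This is a different technique from the unnesting approach in the answer, not something proposed in the thread. The sketch assumes that the intervals are half-open (`end` excluded, as in the slicing from the question) and that all bounds are non-negative integers. The names `starts`, `ends`, `signals`, `diff`, and `dense` are illustrative.

```python
import numpy as np
import pandas as pd

test_df = pd.DataFrame(
    {'start': [5, 7, 15], 'end': [10, 12, 20], 'signal': [1, 2, 3]},
    columns=['start', 'end', 'signal'],
)

starts = test_df['start'].to_numpy()
ends = test_df['end'].to_numpy()
signals = test_df['signal'].to_numpy()

# One extra slot so an interval ending at the last position can still be closed
diff = np.zeros(ends.max() + 1, dtype=signals.dtype)

# Add the signal where each interval opens and subtract it where it closes;
# ufunc.at accumulates correctly even when indices repeat
np.add.at(diff, starts, signals)
np.subtract.at(diff, ends, signals)

# The running total gives the summed signal at every position
dense = np.cumsum(diff)[:-1]
print(dense)
```

This works on arrays proportional to the largest end value and the number of rows, so it does not expand each interval into one row per position.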