My data looks like:

list=[44359, 16610,  8364, ...,     1,     1,     1]

For each element x[i] in the list I want to compute x[i] * (x[i+1] + x[i-1]) / 2, where x[i+1] and x[i-1] are the elements adjacent to x[i].

For some reason I cannot seem to do this cleanly in NumPy.

Here's what I've tried:

weights=[]
weights.append(1)
for i in range(len(hoff[3])-1):
    weights.append((hoff[3][i-1]+hoff[3][i+1])/2)

I append 1 to the weights list so that the lengths match at the end. I picked 1 arbitrarily; I'm not sure how to handle the leftmost and rightmost points either.

user4261201
qdspec

2 Answers

I would use pandas for this, filling in the missing left- and right-most values with 1 (but you can use any value you want):

import numpy
import pandas

numpy.random.seed(0)
data = numpy.random.randint(0, 10, size=15)

df = (
    pandas.DataFrame({'hoff': data})
        .assign(before=lambda df: df['hoff'].shift(1).fillna(1).astype(int))
        .assign(after=lambda df: df['hoff'].shift(-1).fillna(1).astype(int))
        .assign(weight=lambda df: df['hoff'] * df[['before', 'after']].mean(axis=1))
)
print(df.to_string(index=False))

And that gives me:

hoff  before  after  weight
   5       1      0     2.5
   0       5      3     0.0
   3       0      3     4.5
   3       3      7    15.0
   7       3      9    42.0
   9       7      3    45.0
   3       9      5    21.0
   5       3      2    12.5
   2       5      4     9.0
   4       2      7    18.0
   7       4      6    35.0
   6       7      8    45.0
   8       6      8    56.0
   8       8      1    36.0
   1       8      1     4.5

A pure numpy-based solution would look like this (again, filling with 1):

before_after = numpy.ones((data.shape[0], 2))
before_after[1:, 0] = data[:-1]
before_after[:-1, 1] = data[1:]
weights = data * before_after.mean(axis=1)
print(weights)

array([  2.5,   0. ,   4.5,  15. ,  42. ,  45. ,  21. ,  12.5,   9. ,
        18. ,  35. ,  45. ,  56. ,  36. ,   4.5])
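
As a sanity check, the same fill-with-1 rule can be written as an explicit loop, which is closer to the attempt in the question. This is only a sketch: the array literal repeats the hoff column printed above, and `fill` and `loop_weights` are names made up for the illustration.

```python
import numpy as np

# Same 15 values as the 'hoff' column printed above.
data = np.array([5, 0, 3, 3, 7, 9, 3, 5, 2, 4, 7, 6, 8, 8, 1])

fill = 1  # hypothetical stand-in for the missing neighbor at either end

loop_weights = []
for k in range(len(data)):
    # Use the real neighbor when it exists, otherwise the fill value.
    left = data[k - 1] if k > 0 else fill
    right = data[k + 1] if k < len(data) - 1 else fill
    loop_weights.append(data[k] * (left + right) / 2)

loop_weights = np.array(loop_weights)
print(loop_weights)
```

The explicit bounds checks are what the slicing into `before_after` replaces. Note that indexing with `k - 1` at `k = 0` without the check would silently wrap around to the last element.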
Paul H

You can use numpy's array operations to represent your "loop". Think of your data as laid out below, where pL and pR are the values you choose to "pad" your data with on the left and right:

[pL, 0, 1, 2, ..., N-2, N-1, pR]

What you're trying to do is this:

[0, ..., N - 1] * ([pL, 0, ..., N-2] + [1, ..., N -1, pR]) / 2

Written in code it looks something like this:

import numpy as np
data = np.random.random(10)

padded = np.concatenate(([data[0]], data, [data[-1]]))
data * (padded[:-2] + padded[2:]) / 2.

Repeating the first and last value is known as "extending" in image processing, but there are other edge handling methods you could try.
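
To compare a few edge handling choices side by side, here is a minimal sketch built on `numpy.pad`. The helper name `neighbor_weights` and the small sample array are assumptions made for illustration; only the `'edge'`, `'reflect'` and `'constant'` pad modes are shown.

```python
import numpy as np

def neighbor_weights(values, mode="edge", **pad_kwargs):
    """Multiply each value by the mean of its left and right neighbors.

    The array is padded by one element on each side with np.pad, so the
    chosen mode decides what the end points see as their missing neighbor.
    """
    padded = np.pad(values, 1, mode=mode, **pad_kwargs)
    return values * (padded[:-2] + padded[2:]) / 2.0

data = np.array([5, 0, 3, 3, 7], dtype=float)

# 'edge' repeats the first and last value (the "extending" described above).
edge = neighbor_weights(data, "edge")
# 'reflect' mirrors around the end points without repeating them.
reflect = neighbor_weights(data, "reflect")
# 'constant' pads with a fixed value, here 1.
const = neighbor_weights(data, "constant", constant_values=1)

print(edge)
print(reflect)
print(const)
```

Only the first and last weights differ between modes; the interior values are the same in all three, so the choice matters mainly when the end points carry real weight in your result.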

Bi Rico
  • Good thinking on the padding. You could do `padded = numpy.pad(data, 1, 'edge')` in place of your concatenation – Paul H Jan 03 '18 at 18:24
  • Fantastic, I ended up using a slight variant of what you've given here. Also appreciate the edge case handling discussion – qdspec Jan 03 '18 at 21:48