
Please consider the following Python code:

import matplotlib.pyplot as plt
import numpy as np

#create some data to plot.

dt = 0.001
t = np.arange(0.0,100,dt)
r = np.exp(-t[:1000]/0.05)
x = np.random.randn(len(t))
s = np.convolve(x,r)[:len(x)]*dt

The code runs and I largely understand what it is doing. However, I am confused about what '[:len(x)]' is actually doing. If I change the definition of 's' to 'np.convolve(x,r)*dt', the code fails with the following error message from 'base.py':

raise ValueError(f"x and y must have same first dimension, but "
ValueError: x and y must have same first dimension, but have shapes (100000,) and (100999,)

What is '[:len(x)]' actually doing, and is there something in the language documentation that gives examples of this sort of usage?

Thanks.

All the objects are of type 'ndarray'.

t is length 100000
t is of shape (100000,)

r is length  1000
r is of shape  (1000,) 

x is length  100000
x is of shape  (100000,) 

s is length  100999
s is of shape  (100999,) 
David
  • The `[:len(x)]` just says "from the results of the `np.convolve` call, take only the first `len(x)` elements". Those are then multiplied by `dt` and the resulting vector is stored in `s`. – Tim Roberts Apr 09 '21 at 05:41
  • You have the exact same concept 2 lines up: `-t[:1000]`, which takes the first 1,000 elements of `t` and uses them in the computation. It's the same thing. It's called slice notation. – Tim Roberts Apr 09 '21 at 05:42
  • So for 'plt.plot(t,s)', the ndarray objects 't' and 's' must have the same shape, that is (100000,)? Otherwise the shapes would be (100000,) and (100999,) respectively? Length and shape seem to coincide in this context. – David Apr 09 '21 at 05:49
  • Yes. `t` and `s` in that case are the x and y coordinates for your plot. They had better have the same cardinality. Can't have x's without y's. – Tim Roberts Apr 09 '21 at 05:53
  • That's great thanks. It seems obvious now from a graphing pov but the code wasn't :-( – David Apr 09 '21 at 05:55

1 Answer

If we read the docs for np.convolve, we see that with the default parameters (mode='full'), it returns an array that is one shorter than the sum of the lengths of the input arrays. That is, if you call np.convolve(a, b) with len(a) = A and len(b) = B, the output has length A + B - 1.

This is because a convolution can be interpreted as integrating the product of two functions, with one of the functions shifted relative to the other. By default, np.convolve calculates this convolution for all points at which these functions overlap, so the length of the output is approximately the sum of the lengths of the input functions. In your case, x has length 100,000, and r has length 1,000, so the output length is 100,000 + 1,000 - 1 = 100,999.
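To see the length rule on something small, here is a minimal sketch. The arrays a and b are made-up stand-ins, not the data from the question:

```python
import numpy as np

a = np.ones(5)  # length A = 5 (illustrative stand-in for x)
b = np.ones(3)  # length B = 3 (illustrative stand-in for r)

full = np.convolve(a, b)  # default mode is 'full'
print(len(full))          # A + B - 1 = 7
```

The same arithmetic with A = 100,000 and B = 1,000 gives the 100,999 in your error message.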

You can change this behaviour with the mode parameter, so that np.convolve truncates the output automatically, but neither of the alternative options seems to match your use case. For your own interest, though, you could try supplying mode='same', which ensures the output is the same length as the longer input, and see what happens. Note that it keeps the central part of the full result rather than the first len(x) values.
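If you want to compare the modes side by side, something like the following sketch should do. The short x and r here are hypothetical stand-ins for your noise signal and decaying kernel, chosen only to keep the output readable:

```python
import numpy as np

x = np.arange(1.0, 7.0)          # length 6, stand-in for the noise signal
r = np.array([1.0, 0.5, 0.25])   # length 3, stand-in for the decaying kernel

full = np.convolve(x, r, mode="full")    # every point of overlap
same = np.convolve(x, r, mode="same")    # same length as the longer input
valid = np.convolve(x, r, mode="valid")  # only points of complete overlap
truncated = full[:len(x)]                # what the code in the question does

print(len(full), len(same), len(valid), len(truncated))  # 8 6 4 6

# 'same' and the manual truncation have equal length but different values.
print(np.allclose(same, truncated))  # False
```

In this small case 'same' and the truncated result are both length 6, but they are different windows onto the full output, which is why swapping one for the other would change your plot.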

Since t (length 100,000) and s need to be the same length for you to plot (I assume) s(t), you need to truncate s to a length of 100,000 to match.

This is what the notation [:len(x)] does. This is called "slice" notation, and the gist is that A[start:stop] allows you to select the subset of values in A from start (inclusive) to stop (exclusive). If you omit start or stop, it defaults to the beginning or end of the array respectively. So [:len(x)] picks from 0 to len(x) (exclusive), which gives you an array of length len(x). This ensures len(s) = len(x).
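A few slices on a toy array, as a rough illustration (A and x here are made up for the example and are not the arrays from the question):

```python
import numpy as np

A = np.arange(10)  # [0 1 2 3 4 5 6 7 8 9]

print(A[2:5])  # [2 3 4]    start inclusive, stop exclusive
print(A[:4])   # [0 1 2 3]  start omitted, so it begins at index 0
print(A[7:])   # [7 8 9]    stop omitted, so it runs to the end

x = np.zeros(4)          # any array of length 4
print(len(A[:len(x)]))   # 4, the slice has exactly len(x) elements
```

The same notation works on plain Python lists and strings, which is where the Python tutorial introduces it.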

Jared
  • Thanks for your most thoughtful response. It forced me to understand the mathematical process of convolution more than I remember. I can see the 'mode' parameter comes into its own when considering potential boundary effects when one function is gating the other. Thanks again. – David Apr 13 '21 at 01:05