8

I would like to use SciPy's deconvolve function to find an unknown distribution given two Gaussian distribions. There is no documentation associated with this function in SciPy, so I'm just looking for an example as to how this function can be used in my situation. For example, given two normal distributions N(100, 1), N(300, 2), I would like to understand how I can find the distribution of the deconvolution N(200, 1).

>>> sample1 = np.round(scipy.around(scipy.stats.norm(100, 1).rvs(size=1000)))
>>> sample2 = np.round(scipy.stats.norm(300, 2).rvs(size=2000))
>>> signal.deconvolve(sample1, sample2)

The above code gives me negative values, which seems wrong. How can I recover the distribtion N(200, 1) from this deconvolution? In particular, I think my problem is that I do not understand how to get the divisor.

What I would really like is to see is an example of how I can recover ~ N(200, 1) from these samples using SciPy's deconvolution.

gypaetus
  • 6,873
  • 3
  • 35
  • 45
user1728853
  • 2,607
  • 4
  • 30
  • 31
  • 2
    You're right that it's not well documented, but you can [see the source](https://github.com/scipy/scipy/blob/v0.11.0/scipy/signal/signaltools.py#L648) and then look at the documentation for [`convolve`](http://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.convolve.html) – askewchan Mar 18 '13 at 18:17
  • 1
    Thanks, I've tried to understand it this way, but I can't seem to understand. An example would be extremely helpful, so I can understand. – user1728853 Mar 18 '13 at 18:23

1 Answers1

5

I think you are a little confused about your expectations... Since we all know that the convolution of two normal distributions is another normal distribution with mean the sum of the means, and variance the sum of the variances, you seem to expect that the convolution of two normal random samples will also be a normal random sample. And that just ain't so:

a = scipy.stats.norm(100, 1).rvs(size=1000)
b = scipy.stats.norm(200, 1).rvs(size=1000)
c = scipy.convolve(a, b)
plt.subplot(311)
plt.hist(a, bins=50)
plt.subplot(312)
plt.hist(a, bins=50)
plt.subplot(313)
plt.hist(a, bins=50)

enter image description here

You were probably thinking of something along the lines of:

a = scipy.stats.norm(100, 10).pdf(np.arange(500))
b = scipy.stats.norm(200, 20).pdf(np.arange(500))
c = scipy.convolve(a, b)
m_ = max(a.max(), b.max(), c.max())
plt.subplot(311)
plt.axis([0, 1000, 0, 1.25*m_])
plt.plot(a)
plt.subplot(312)
plt.axis([0, 1000, 0, 1.25*m_])
plt.plot(b)
plt.subplot(313)
plt.axis([0, 1000, 0, 1.25*m_])
plt.plot(c)

enter image description here

In any case, getting back to deconvolve... If you call it with two arrays of length m and n, it will return you a tuple with two arrays:

  • the first of length m - n + 1 is the deconvolved array, i.e. the array you should convolve the second one with, to get the first
  • the second of length m is the error in replacing the first array with the convolution of the second with the first returned one.
Jaime
  • 65,696
  • 17
  • 124
  • 159
  • Jaime, thanks for the great answer. How can I use the arrays returned from `deconvolve()` to recover my signal? From you explanation, I'm calling `deconvolve(c, a)`, this gives me an array `m` with negative numbers? I should mention my data is discrete (corrected in the question above). – user1728853 Mar 18 '13 at 19:42
  • @user1728853 When you call `deconvolve(c, a)` the return is `(b, e)`. `b` is your signal, `e` is the error, i.e. `convolve(a, b) = c - e`. If `c` is the result of convolving `a` with something, then the errors will be small, and you can take `b` as your signal. In your example, `sample2` is not something you would get from convolving `sample1` with anything, hence you get large errors and the return of `deconvolve` is basically meaningless. – Jaime Mar 18 '13 at 19:48
  • If I have distributions N(100, 1) and N(200, 1) their convolution is N(300, 2). Isn't the deconvolution of N(300, 2) and N(100, 1) = N(200, 1)? I guess, I'm just trying to see a working example of deconvolution with 2 distributions. – user1728853 Mar 18 '13 at 19:56
  • @user1728853 The convolution of the **probability density functions** of N(100, 1) and N(200, 1) is the **probability density function** of N(300, 2). But the convolution of a **random sample** from N(100, 1) with a **random sample** from N(200, 1) **is not** a **random sample** from N(300, 2). That's why your example doesn't work. – Jaime Mar 18 '13 at 20:01
  • Thanks for your patients. I really appreciate the help. When I convert my discrete data into PMFs, then deconvolve the PMFs, I still get an empty m array and an n array with negative numbers. This seems wrong still? Shouldn't I get an m array with ~ N(200, 1)? – user1728853 Mar 18 '13 at 20:09
  • Somehow this makes all sense, but it seems that deconvolve still does not act as expected. I did ask a slightly different question (http://stackoverflow.com/questions/40615034/understanding-scipy-deconvolve), so if anyone has an answer to that as well.. – ImportanceOfBeingErnest Nov 15 '16 at 18:02