I am trying to visualize the differences between two distributions, preferably using Python. I've plotted the cumulative frequency distributions as well as kernel density estimates (linked plots: kde, cumulative frequency).

However, my audience is not used to looking at graphs like this, so I would like to plot just the difference between the distributions and scale it so that it is visually more obvious how they differ. I found this post where the first answer shows a solution to this using R. I have no experience with R, so I would like to know if there is a way to implement this in Python.

EDIT: I am trying to plot the difference itself, not just shade the areas that are different, like in the third plot of the linked answer (abs(ydiff)). Having it included in the original graph, like the "5*difference of density" shaded area in the first plot, would be nice but is not necessary.

Or, if anyone has another idea of how to visually emphasize the difference between two distributions, I'd love to hear it!

TinaG
  • Use [`fill_between`](https://matplotlib.org/examples/pylab_examples/fill_between_demo.html) – ImportanceOfBeingErnest Jun 24 '17 at 16:44
  • Sorry if I was not clear enough. I am trying to plot the difference like the "shaded area 5*difference of densities" part in the answer I linked to. Or just like in the third graph of that answer (the abs(ydiff)). – TinaG Jun 24 '17 at 16:59
  • Maybe the point is more that it's not clear from the question where exactly the problem lies. So you said you already have the distributions (call them `d1` and `d2`) themselves. You also know how to plot a curve. So if you can plot `d1` and if you can plot `d2`, what is the problem of plotting `d2-d1`? – ImportanceOfBeingErnest Jun 24 '17 at 17:30
  • Ah ok. I will try writing a different question then. I've tried plotting the differences but the result never looks "right" and I'm not sure what exactly the issue is, except that it probably has something to do with the curves not sharing the same x values. Thank you! – TinaG Jun 24 '17 at 17:53
  • If they don't share the same x values, you may need to interpolate either one of the arrays of y values to the other or both to a common array of x positions. This can be done using e.g. numpy.interp. If you have problems with that, you can ask a question with a [mcve] of the code and I'm sure you will get some help here. – ImportanceOfBeingErnest Jun 24 '17 at 18:05
  • Thank you so so much! Using numpy.interp works and fixes my problem! – TinaG Jun 24 '17 at 18:47
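
The approach suggested in the comments (interpolate both curves onto a common array of x positions with `numpy.interp`, then subtract) could look roughly like the sketch below. It is not the asker's actual code, which was not posted. The function names `curves_on_common_grid` and `plot_scaled_difference`, the grid size, and the scale factor of 5 are illustrative assumptions. It also assumes the curves are densities, so values outside a curve's sampled range are treated as 0; for cumulative curves the right-hand fill value would need to be 1 instead.

```python
import numpy as np


def curves_on_common_grid(x1, y1, x2, y2, num=500):
    """Resample two curves onto one shared x grid and return their difference.

    x1 and x2 must be increasing, as required by numpy.interp.
    Returns (grid, d1, d2, d2 - d1).
    """
    lo = min(x1[0], x2[0])
    hi = max(x1[-1], x2[-1])
    grid = np.linspace(lo, hi, num)
    # Assumption: the curves are densities, so they are 0 outside their sampled range.
    d1 = np.interp(grid, x1, y1, left=0.0, right=0.0)
    d2 = np.interp(grid, x2, y2, left=0.0, right=0.0)
    return grid, d1, d2, d2 - d1


def plot_scaled_difference(grid, d1, d2, scale=5.0):
    """Plot both curves plus a shaded, scaled difference (scale is a free choice)."""
    import matplotlib.pyplot as plt  # imported here so the resampling works without it

    fig, ax = plt.subplots()
    ax.plot(grid, d1, label="d1")
    ax.plot(grid, d2, label="d2")
    ax.fill_between(grid, 0, scale * (d2 - d1), alpha=0.3,
                    label=f"{scale:g} x (d2 - d1)")
    ax.axhline(0, color="black", linewidth=0.5)
    ax.legend()
    return fig
```

With the x and y arrays of the two existing curves in hand, calling `curves_on_common_grid(x1, y1, x2, y2)` and passing the first three results to `plot_scaled_difference` draws the shaded difference on top of the original curves. Plotting `np.abs(d2 - d1)` against `grid` instead gives the absolute-difference view mentioned in the question.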

0 Answers