Questions tagged [kernel-density]

Kernel density estimation is a non-parametric way to estimate the probability density function of a random variable.

Kernel density estimation is a fundamental data smoothing problem in which inferences about a population are made from a finite data sample. Kernel density estimates are closely related to histograms, but can be endowed with properties such as smoothness or continuity by using a suitable kernel.

http://en.wikipedia.org/wiki/Kernel_density_estimation
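
As a rough illustration of the definition above, here is a minimal sketch of a 1D Gaussian kernel density estimate written with only the Python standard library. The function name `gaussian_kde_1d`, the evaluation grid, and the fixed bandwidth of 0.5 are assumptions chosen for illustration; the sample values are borrowed from the first question listed below. Library implementations such as `scipy.stats.gaussian_kde` choose the bandwidth differently by default.

```python
import math

def gaussian_kde_1d(samples, points, bandwidth):
    """Evaluate a Gaussian kernel density estimate at each value in points.

    Hypothetical helper for illustration: every sample contributes one
    Gaussian bump with standard deviation `bandwidth`, and the bumps are
    averaged so the result integrates to 1.
    """
    norm = len(samples) * bandwidth * math.sqrt(2.0 * math.pi)
    return [
        sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples) / norm
        for x in points
    ]

# Sample values reused from the first question in this listing.
data = [1.5] * 7 + [2.5] * 2 + [3.5] * 8 + [4.5] * 3 + [5.5] * 1 + [6.5] * 8

# Evaluate on an evenly spaced grid wide enough to cover the tails.
step = 0.01
grid = [-2.0 + i * step for i in range(1201)]
density = gaussian_kde_1d(data, grid, bandwidth=0.5)

# A Riemann sum over the grid should be close to 1 for a valid density.
area = sum(density) * step
```

Unlike a histogram, the resulting curve is smooth and does not depend on bin edges. The bandwidth plays the role of the bin width: smaller values follow the data more closely, larger values smooth more.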

656 questions
159
votes
6 answers

How to create a density plot

In R I can create the desired output by doing: data = c(rep(1.5, 7), rep(2.5, 2), rep(3.5, 8), rep(4.5, 3), rep(5.5, 1), rep(6.5, 8)) plot(density(data, bw=0.5)) In python (with matplotlib) the closest I got was with a simple…
unode
  • 9,321
  • 4
  • 33
  • 44
131
votes
4 answers

How can I make a scatter plot colored by density?

I'd like to make a scatter plot where each point is colored by the spatial density of nearby points. I've come across a very similar question, which shows an example of this using R: R Scatter Plot: symbol color represents number of overlapping…
2964502
  • 4,301
  • 12
  • 35
  • 55
42
votes
3 answers

How would one use Kernel Density Estimation as a 1D clustering method in scikit learn?

I need to cluster a simple univariate data set into a preset number of clusters. Technically it would be closer to binning or sorting the data since it is only 1D, but my boss is calling it clustering, so I'm going to stick to that name. The…
37
votes
1 answer

How to plot a 3D density map in python with matplotlib

I have a large dataset of (x,y,z) protein positions and would like to plot areas of high occupancy as a heatmap. Ideally the output should look similar to the volumetric visualisation below, but I'm not sure how to achieve this with matplotlib.…
nv_wu
  • 1,045
  • 1
  • 13
  • 24
29
votes
3 answers

two-way density plot combined with one way density plot with selected regions in r

# data set.seed (123) xvar <- c(rnorm (1000, 50, 30), rnorm (1000, 40, 10), rnorm (1000, 70, 10)) yvar <- xvar + rnorm (length (xvar), 0, 20) myd <- data.frame (xvar, yvar) # density plot for xvar upperp = 80 # upper cutoff …
SHRram
  • 4,127
  • 7
  • 35
  • 53
18
votes
2 answers

Limit the range of x in seaborn distplot KDE estimation

Suppose we have an array with numbers between 0 and 1: arr=np.array([ 0. , 0. , 0. , 0. , 0.6934264 , 0. , 0. , 0. , 0. , 0. , 0. , 0. …
Ashkan
  • 1,643
  • 5
  • 23
  • 45
18
votes
2 answers

Multivariate kernel density estimation in Python

I am trying to use SciPy's gaussian_kde function to estimate the density of multivariate data. In my code below I sample a 3D multivariate normal and fit the kernel density but I'm not sure how to evaluate my fit. import numpy as np from scipy…
akhil
  • 839
  • 3
  • 8
  • 15
17
votes
3 answers

Add KDE on to a histogram

I would like to add a density plot to my histogram diagram. I know something about pdf function but I've got confused and other similar questions were not helpful. from scipy.stats import * from numpy import* from matplotlib.pyplot import* from…
aaa
  • 161
  • 1
  • 1
  • 8
17
votes
2 answers

KDE is very slow with large data

When I try to make a scatter plot, colored by density, it takes forever. Probably because the length of the data is quite big. This is basically how I do it: xy = np.vstack([np.array(x_values),np.array(y_values)]) z =…
codeKiller
  • 5,493
  • 17
  • 60
  • 115
17
votes
1 answer

How to better fit seaborn violinplots

The following code gives me a very nice violinplot (and boxplot within). import numpy as np import seaborn as sns import matplotlib.pyplot as plt foo = np.random.rand(100) sns.violinplot(foo) plt.boxplot(foo) plt.show() So far so good. However,…
n1000
  • 5,058
  • 10
  • 37
  • 65
17
votes
2 answers

How can I get the value of a kernel density estimate at specific points?

I am experimenting with ways to deal with overplotting in R, and one thing I want to try is to plot individual points but color them by the density of their neighborhood. In order to do this I would need to compute a 2D kernel density estimate at…
Ryan C. Thompson
  • 40,856
  • 28
  • 97
  • 159
16
votes
1 answer

Add density lines to histogram and cumulative histogram

I want to add density curve to histogram and cumulative histogram, like this: Here is as far I can go: hist.cum <- function(x, plot=TRUE, ...){ h <- hist(x, plot=FALSE, ...) h$counts <- cumsum(h$counts) h$density <- cumsum(h$density) …
jon
  • 11,186
  • 19
  • 80
  • 132
15
votes
2 answers

R - How to find points within specific Contour

I am creating density plots with kde2d (MASS) on lat and lon data. I would like to know which points from the original data are within a specific contour. I create 90% and 50% contours using two approaches. I want to know which points are within the…
squishy
  • 344
  • 3
  • 12
15
votes
3 answers

Weighted Gaussian kernel density estimation in `python`

Update: Weighted samples are now supported by scipy.stats.gaussian_kde. See here and here for details. It is currently not possible to use scipy.stats.gaussian_kde to estimate the density of a random variable based on weighted samples. What methods…
Till Hoffmann
  • 9,479
  • 6
  • 46
  • 64
14
votes
1 answer

how does 2d kernel density estimation in python (sklearn) work?

I am sorry for the probably stupid question, but I have been trying for hours to estimate a density from a set of 2D data. Let's assume my data is given by the array: sample = np.random.uniform(0,1,size=(50,2)) . I just want to use scipys scikit learn…
murph
  • 203
  • 1
  • 3
  • 7