Questions tagged [cdf]

CDF is an acronym for cumulative distribution function. While the probability density function (pdf) gives the probability density at each value of a random variable, the cdf (often denoted F(x)) gives the probability that the random variable will be less than or equal to a specified value x.

A cumulative distribution function describes the probability that a real-valued random variable X with a given probability distribution will be found at a value less than or equal to x, that is, F(x) = P(X <= x).

The cdf of a discrete random variable at x is the sum of the probability mass function (pmf) over all values less than or equal to x. If the random variable is continuous, the cdf is the integral of the probability density function (pdf) from minus infinity to x.
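
As a rough illustration of the two cases, the following Python sketch builds the cdf of a fair six-sided die by cumulatively summing its pmf, and approximates the standard normal cdf by numerically integrating its pdf. The die, the lower integration bound of -10 (a stand-in for minus infinity), and the helper names `discrete_cdf` and `integrate_pdf` are assumptions made for this example; only the standard library is used.

```python
from fractions import Fraction
from itertools import accumulate
from statistics import NormalDist


def discrete_cdf(pmf):
    """Return {value: P(X <= value)} for a pmf given as {value: probability}."""
    values = sorted(pmf)
    running = accumulate(pmf[v] for v in values)
    return dict(zip(values, running))


def integrate_pdf(pdf, lower, upper, steps=10_000):
    """Approximate the integral of pdf over [lower, upper] (trapezoidal rule)."""
    width = (upper - lower) / steps
    total = 0.5 * (pdf(lower) + pdf(upper))
    for i in range(1, steps):
        total += pdf(lower + i * width)
    return total * width


# Discrete case: a fair six-sided die (assumed example distribution).
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}
die_cdf = discrete_cdf(die_pmf)

# Continuous case: standard normal, integrating from -10 instead of -infinity.
std_normal = NormalDist(mu=0.0, sigma=1.0)
approx = integrate_pdf(std_normal.pdf, -10.0, 1.0)
exact = std_normal.cdf(1.0)

print("P(die <= 3) =", die_cdf[3])
print("integrated pdf up to 1.0:", approx)
print("NormalDist.cdf(1.0):     ", exact)
```

The exact `Fraction` arithmetic keeps the discrete sums free of rounding error; for the continuous case the two printed numbers should agree closely but not exactly, since the integral is only approximated.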

In applied statistics, cdfs are important for comparing distributions: they underlie plots (e.g., P-P plots) and hypothesis tests (e.g., the Kolmogorov-Smirnov test).
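
To make the link to the Kolmogorov-Smirnov test concrete, here is a minimal sketch of the one-sample KS statistic, which is the largest vertical gap between the empirical cdf of a sample and a reference cdf. The function names `ecdf` and `ks_statistic`, the sample size, and the seed are assumptions for illustration. The sketch computes only the statistic, not a p-value; in practice a library routine such as `scipy.stats.kstest` would normally be used.

```python
import random
from bisect import bisect_right
from statistics import NormalDist


def ecdf(sample):
    """Return a function x -> fraction of sample values <= x."""
    ordered = sorted(sample)
    n = len(ordered)

    def evaluate(x):
        return bisect_right(ordered, x) / n

    return evaluate


def ks_statistic(sample, reference_cdf):
    """Largest gap between the empirical cdf of sample and reference_cdf."""
    ordered = sorted(sample)
    n = len(ordered)
    gap = 0.0
    for i, x in enumerate(ordered):
        f = reference_cdf(x)
        # The empirical cdf steps up at x, so check the gap on both sides.
        gap = max(gap, (i + 1) / n - f, f - i / n)
    return gap


# Assumed example data: 500 draws from a standard normal, fixed seed.
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(500)]

d_matching = ks_statistic(sample, NormalDist(mu=0.0, sigma=1.0).cdf)
d_shifted = ks_statistic(sample, NormalDist(mu=1.0, sigma=1.0).cdf)

print("KS statistic vs N(0, 1):", d_matching)
print("KS statistic vs N(1, 1):", d_shifted)
```

The statistic against the matching distribution should come out much smaller than the one against the shifted distribution, which is the intuition behind using the cdf gap as a test of fit.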

Common Data Format

Note that CDF is also an acronym for Common Data Format; see the NASA documentation for more details.

341 questions
69 votes, 11 answers

Plotting CDF of a pandas series in python

Is there a way to do this? I cannot seem to find an easy way to interface a pandas Series with plotting a CDF.
wolfsatthedoor
24 votes, 1 answer

Fitting data points to a cumulative distribution

I am trying to fit a gamma distribution to my data points, and I can do that using code below. import scipy.stats as ss import numpy as np dataPoints = np.arange(0,1000,0.2) fit_alpha,fit_loc,fit_beta = ss.rv_continuous.fit(ss.gamma, dataPoints,…
Sahil M
17 votes, 3 answers

Multivariate Normal CDF in Python using scipy

In order to calculate the CDF of a multivariate normal, I followed this example (for the univariate case) but cannot interpret the output produced by scipy: from scipy.stats import norm import numpy as np mean = np.array([1,5]) covariance =…
statBeginner
16 votes, 2 answers

Logarithmic plot of a cumulative distribution function in matplotlib

I have a file containing logged events. Each entry has a time and latency. I'm interested in plotting the cumulative distribution function of the latencies. I'm most interested in tail latencies so I want the plot to have a logarithmic y-axis. I'm…
nic
14 votes, 1 answer

ggplot scale transformation acts differently on points and functions

I'm trying to plot a distribution CDF using R and ggplot2. However, I am finding difficulties in plotting the CDF function after I transform the Y axis to obtain a straight line. This kind of plot is frequently used in Gumbel paper plots, but here…
AF7
9 votes, 1 answer

Vectorizing the multivariate normal CDF (cumulative density function) in Python

How can I vectorize the multivariate normal CDF (cumulative density function) in Python? When looking at this post, I found out that there is a Fortran implementation of the multivariate CDF that was "ported" over to Python. This means I can easily…
Felipe D.
9 votes, 1 answer

R ggplot: Weighted CDF

I'd like to plot a weighted CDF using ggplot. Some old non-SO discussions (e.g. this from 2012) suggest this is not possible, but thought I'd reraise. For example, consider this data: df <- data.frame(x=sort(runif(100)), w=1:100) I can show an…
Max Ghenis
9 votes, 6 answers

Read file and plot CDF in Python

I need to read a long file with timestamps in seconds and plot the CDF using numpy or scipy. I tried with numpy, but the output is not what it is supposed to be. The code is below; any suggestions appreciated. import numpy as np import…
Phani.lav
6 votes, 3 answers

Getting data out of CDF-player

For my Skeptics working group I wrote a program in Mathematica to test a dowser's ability to assess the status of persons shown to them by means of photographs. For a null measurement I distributed this document to my group's members in CDF form…
Sjoerd C. de Vries
6 votes, 1 answer

How to find the joint cumulative distribution function from a 2-D copula in R?

I am now working on copula in R and I wonder how to find the joint cumulative distribution in R? D = c(1,3,2,2,8,2,1,3,1,1,3,3,1,1,2,1,2,1,1,3,4,1,1,3,1,1,2,1,3,7,1,4,6,1,2,1,1,3,1,2,2,3,4,1,1,1,1,2,2,12,1,1,2,1,1,1,3,4) S =…
Yang Yang
6 votes, 3 answers

Curve fitting: Find the smoothest function that satisfies a list of constraints

Consider the set of non-decreasing surjective (onto) functions from (-inf,inf) to [0,1]. (Typical CDFs satisfy this property.) In other words, for any real number x, 0 <= f(x) <= 1. The logistic function is perhaps the most well-known example. We…
dreeves
6 votes, 2 answers

How to draw multiple CDF plots of vectors with different number of rows

I want to draw the CDF plot of multiple variables in the same graph. The length of the variables are different. To simplify the detail, I use the following example code: library("ggplot2") a1 <- rnorm(1000, 0, 3) a2 <- rnorm(1000, 1, 4) a3 <-…
Excalibur
5 votes, 6 answers

r : ecdf over histogram

In R, with ecdf I can plot an empirical cumulative distribution function plot(ecdf(mydata)) and with hist I can plot a histogram of my data hist(mydata) How can I plot the histogram and the ecdf in the same plot? EDIT I tried to make something like…
JuanPablo
5 votes, 1 answer

ggplot: adjusting alpha/fill two factors cdf

I'm having some issues getting my ggplot alpha to be sufficiently dark for my plot. Example code: ggplot(mtcars, aes(x=mpg, color=factor(gear), alpha=factor(carb))) + stat_ecdf() As you can see, whenever carb == 1, it's very difficult to see the…
NewRRecruit
4 votes, 1 answer

Is there a built in Chi square CDF function in C++

I am trying to find a builtin CDF for chi square distribution. Basically, I wish to have a CDF function like pchisq in R, where chisquare(x,p,q) gives you the probability. x is the distribution of the function, p is the dof and q is the…
hao