6

I tried to fit the following data (red dots) with the Zipf distribution PDF in Python, F ~ x^(-a). I simply chose a = 0.56 and plotted y = x^(-0.56), which gave the curve shown below.

The curve is obviously wrong. I don't know how to do the curve fitting.

[Image: plot of the data (red dots) and the curve y = x^(-0.56)]

mareoraft
manxing

2 Answers

8

I'm not sure exactly what you are looking for, but if you want to fit a model (function) to data, use scipy.optimize.curve_fit:

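from scipy.optimize import curve_fit
from scipy.special import zetac


def f(x, a):
    # Zipf PDF: x^(-a) / zeta(a); zetac(a) is zeta(a) - 1, so add 1 back
    return (x**-a) / (zetac(a) + 1)


# x, y: arrays holding the data to fit
result = curve_fit(f, x, y, p0=[0.56])
p = result[0]  # fitted parameter(s); result[1] is the covariance estimate

print(p)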

If you don't trust the normalization, add a second parameter b and fit that as well.
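As a rough sketch of that two-parameter variant: the function name `power_law` and the synthetic data below are made up for illustration (the real x and y from the question are not available), and b simply replaces the fixed normalization with a free scale factor.

```python
import numpy as np
from scipy.optimize import curve_fit


def power_law(x, a, b):
    # b is a free scale factor that replaces the fixed zeta normalization
    return b * x ** (-a)


# Synthetic stand-in for the real data (assumption): ranks 1..50,
# generated with a known exponent 0.56 and scale 120
x = np.arange(1, 51, dtype=float)
y = 120.0 * x ** (-0.56)

# Fit both the exponent and the scale, starting from a rough guess
(a_fit, b_fit), _ = curve_fit(power_law, x, y, p0=[0.5, 100.0])
print(a_fit, b_fit)
```

With real, noisy data the recovered values depend on the starting guess p0, so it may be worth trying a few.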

0

You need an intercept in your log-log plot; right now it is 0.

If frequency follows a power of the inverse rank, there is a constant of proportionality k between the frequency and x^(-a), so you need to fit:

F~x^(-a) => F = k/(x^a)
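One way to get both k and a, sketched here on made-up data since the original points are not available: taking logs gives log F = log k - a log x, which is a straight line in log-log space, so an ordinary linear fit with numpy.polyfit returns the slope (-a) and the intercept (log k).

```python
import numpy as np

# Synthetic stand-in for the real data (assumption): ranks 1..50,
# generated with a known exponent 0.56 and constant k = 120
x = np.arange(1, 51, dtype=float)
F = 120.0 * x ** (-0.56)

# log F = log k - a * log x, so fit a degree-1 polynomial in log-log space
slope, intercept = np.polyfit(np.log(x), np.log(F), 1)

a = -slope            # exponent of the power law
k = np.exp(intercept)  # the intercept that was missing from the plot
print(a, k)
```

Note that a least-squares fit on the logs weights the points differently from a fit on the raw values, so the two approaches can give slightly different parameters on noisy data.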

Olsgaard