
I'm trying to fit a function f(x,y,z) with the following quadratic polynomial:

f(x, y, z) = x²/(2 mxx) + x·y/mxy + x·z/mxz + y²/(2 myy) + y·z/myz + z²/(2 mzz) + f(0, 0, 0)

The data describes some distorted spherical surface in three dimensions. The problem is related to the calculation of effective masses in solid state physics.

Here is a picture of the data to show that it indeed falls off parabolically in all directions, even though the curvature in the z-direction is rather low:

[figure: 3d parabolas]

I'm interested in the coefficients, which correspond to effective masses. I've got an array of xyz coordinates, which is regular and centered on the maximum:

[[ 0.          0.          0.        ]
 [ 0.          0.          0.01282017]
 [ 0.          0.          0.02564034]
 ...
 [-0.05026321 -0.05026321 -0.03846052]
 [-0.05026321 -0.05026321 -0.02564034]
 [-0.05026321 -0.05026321 -0.01282017]]

And a corresponding 1D array of scalar values, one for each point. The number of data points around this maximum can range from 100 to 1000.

This is the code I'm currently trying to use for fitting:

import scipy.optimize

def func(data, mxx, mxy, mxz, myy, myz, mzz):
    x = data[:, 0]
    y = data[:, 1]
    z = data[:, 2]
    return (
        (1 / (2 * mxx)) * (x ** 2)
        + (1 / mxy) * (x * y)
        + (1 / mxz) * (x * z)
        + (1 / (2 * myy)) * (y ** 2)
        + (1 / myz) * (y * z)
        + (1 / (2 * mzz)) * (z ** 2)
    ) + f000  # f000 = f(0, 0, 0), see below

energy = data[:, 3]
# mxx, ..., mzz are initial guesses from a finite-difference differentiation
guess = (mxx, mxy, mxz, myy, myz, mzz)
params, pcov = scipy.optimize.curve_fit(
    func, data, energy, p0=guess, method="trf"
)

Here f000 = f(0, 0, 0) is the value of the function at (0, 0, 0), which I retrieve with the scipy.interpolate.griddata function.
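For concreteness, a minimal sketch of that lookup; the array name data is assumed here to hold the [x, y, z, energy] rows:

import numpy as np
from scipy.interpolate import griddata

# interpolate the scattered energies at the origin (illustrative names)
f000 = griddata(data[:, :3], data[:, 3],
                np.array([[0.0, 0.0, 0.0]]), method="linear")[0]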

For this problem, the masses should be negative, with values roughly between -0.2 and -2. I'm creating the guess values through a finite-difference differentiation, along the lines of the sketch below.
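A hypothetical version of such a seed, assuming the energies have been reshaped onto the regular grid as a 3D array v centred on the maximum at index (i, j, k), with grid spacings hx, hy, hz (all of these names are assumptions, not from the post):

def fd_guess(v, i, j, k, hx, hy, hz):
    # second-order central differences around the maximum
    d2xx = (v[i + 1, j, k] - 2 * v[i, j, k] + v[i - 1, j, k]) / hx**2
    d2yy = (v[i, j + 1, k] - 2 * v[i, j, k] + v[i, j - 1, k]) / hy**2
    d2zz = (v[i, j, k + 1] - 2 * v[i, j, k] + v[i, j, k - 1]) / hz**2
    # mixed derivatives via the four-point cross stencil
    d2xy = (v[i + 1, j + 1, k] - v[i + 1, j - 1, k]
            - v[i - 1, j + 1, k] + v[i - 1, j - 1, k]) / (4 * hx * hy)
    d2xz = (v[i + 1, j, k + 1] - v[i + 1, j, k - 1]
            - v[i - 1, j, k + 1] + v[i - 1, j, k - 1]) / (4 * hx * hz)
    d2yz = (v[i, j + 1, k + 1] - v[i, j + 1, k - 1]
            - v[i, j - 1, k + 1] + v[i, j - 1, k - 1]) / (4 * hy * hz)
    # with E ~ x²/(2 mxx) + x·y/mxy + ..., each mass is the inverse derivative
    return (1 / d2xx, 1 / d2xy, 1 / d2xz, 1 / d2yy, 1 / d2yz, 1 / d2zz)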

However, I don't get any sensible results from scipy.optimize.curve_fit; typically the coefficients end up as huge numbers (like 1e9). I'm completely lost at this point.

What am I doing wrong :( ?

roman
  • The docs of `curve_fit` say: `curve_fit( f, xdata, ydata, ...)` assume `ydata = f(xdata)`, so I'd say you have a problem passing your data properly. – mikuszefski Jan 27 '20 at 09:08
  • ..and why `trf`? – mikuszefski Jan 27 '20 at 09:11
  • I've been using method="trf" because I was experimenting with bounds, which didn't work out. The data array contained [x, y, z, energy] values, so the ydata corresponded to data[:, 3] – roman Jan 27 '20 at 15:05

1 Answer


One of the problems is that you fit 1/m. While this is correct from a physics point of view, it is bad from the algorithm's point of view: if the fitting algorithm needs to change the sign of an m near zero, the coefficient diverges. Consequently, it is better to fit mI = 1/m and do the corresponding error propagation afterwards. Here I use leastsq, which requires some additional calculation for the covariance matrix (as it returns the reduced form). I do the fit with g() and the inverse masses, but you can immediately reproduce your problems by introducing f() and fitting the ms directly. Note that g() is linear in its parameters, so the same fit could in principle also be done with a plain linear solver, as in the sketch below.
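A minimal sketch of that linear alternative, assuming data is the (N, 4) array of [x, y, z, v] rows (illustrative, not part of the original answer):

import numpy as np

# design matrix matching g(): six inverse-mass terms plus the offset column
x, y, z, v = data.T
A = np.column_stack([x * x / 2, x * y, x * z,
                     y * y / 2, y * z, z * z / 2,
                     np.ones_like(x)])
solI, *_ = np.linalg.lstsq(A, v, rcond=None)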

A second point is that the data has an offset, i.e. at x = y = z = 0 the data value is v = -0.0195. This needs to be introduced into the model.

Finally, I'd say that you already have non-parabolic behaviour in your data.

Nevertheless, here is how it looks:

import matplotlib.pyplot as plt
import numpy as np
np.set_printoptions(linewidth=300)

from scipy.optimize import leastsq
from scipy.optimize import curve_fit

data = np.loadtxt( "silicon.csv", delimiter=',' )

# model in terms of the masses themselves (reproduces the original problem)
def f( x, y, z, mxx, mxy, mxz, myy, myz, mzz, offI ):
    out = 1./(2 * mxx) * x * x
    out += 1./( mxy ) * x * y
    out += 1./( mxz ) * x * z
    out += 1./( 2 * myy ) * y * y
    out += 1./( myz ) * y * z
    out += 1./( 2 * mzz ) * z * z
    out += 1./offI
    return out

# model in terms of the inverse masses mI = 1/m, linear in all parameters
def g( x, y, z, mxxI, mxyI, mxzI, myyI, myzI, mzzI, off ):
    out = mxxI / 2 * x * x
    out += mxyI  * x * y
    out += mxzI * x * z
    out += myyI / 2 * y * y
    out += myzI  * y * z
    out += mzzI / 2  * z * z
    out += off
    return out


# residuals of g() against the data, as required by leastsq
def residuals( params, indata ):
    out = list()
    for x, y, z, v in indata:
        out.append( v - g( x,y, z, *params ) )
    return out

# seven parameters: six inverse masses plus the offset, all started at zero
sol, cov, info, msg, ier = leastsq( residuals, 7 * [0], args=( data, ), full_output=True )

# rescale the reduced covariance returned by leastsq by the residual variance
s_sq = sum( x**2 for x in residuals( sol, data ) ) / ( len( data ) - len( sol ) )
print( "solution" )
print( sol )

masses = [ 1 / x for x in sol ]
print( "masses:" )
print( masses )

print( "covariance matrix:" )
covMX = cov * s_sq
print( covMX )

print( "sum of residuals" )
print( sum( residuals( sol, data ) ) )

### plotting the cuts

fig = plt.figure('cuts')
ax = dict()
for i in range( 1, 10 ):
    ax[i] = fig.add_subplot( 3, 3, i )

dl = np.linspace( -.2, .2, 25)

#### xx
xdata = [ [ x, v ] for x,y,z,v in data if ( abs(y)<1e-3 and abs(z) < 1e-3 ) ]
vl = np.fromiter( ( f( x, 0, 0, *masses ) for x in dl ), float )
ax[1].plot( *zip(*sorted( xdata ) ), ls='', marker='o')
ax[1].plot( dl, vl )

#### xy
xydata = [ [ x, v ] for x, y, z, v in data if ( abs( x - y )<1e-2 and abs(z) < 1e-3 ) ]
vl = np.fromiter( ( f( xy, xy, 0, *masses ) for xy in dl ), float )
ax[2].plot( *zip(*sorted( xydata ) ), ls='', marker='o')
ax[2].plot( dl, vl )

#### xz
xzdata = [ [ x, v ] for x, y, z, v in data if ( abs( x - z )<1e-2 and abs(y) < 1e-3 ) ]
vl = np.fromiter( ( f( xz, 0, xz, *masses ) for xz in dl ), float )
ax[3].plot( *zip(*sorted( xzdata ) ), ls='', marker='o')
ax[3].plot( dl, vl )

#### yy
ydata = [ [ y, v ] for x, y, z, v in data if ( abs(x)<1e-3 and abs(z) < 1e-3 ) ]
vl = np.fromiter( ( f( 0, y, 0, *masses ) for y in dl ), float )
ax[5].plot( *zip(*sorted( ydata ) ), ls='', marker='o' )
ax[5].plot( dl, vl )

#### yz
yzdata = [ [ y, v ] for x, y, z, v in data if ( abs( y - z )<1e-2 and abs(x) < 1e-3 ) ]
vl = np.fromiter( ( f( 0, yz, yz, *masses ) for yz in dl ), float )
ax[6].plot( *zip(*sorted( yzdata ) ), ls='', marker='o')
ax[6].plot( dl, vl )

#### zz
zdata = [ [ z, v ] for x, y, z, v in data if ( abs(x)<1e-3 and abs(y) < 1e-3 ) ]
vl = np.fromiter( ( f( 0, 0, z, *masses ) for z in dl ), float )
ax[9].plot( *zip(*sorted( zdata ) ), ls='', marker='o' )
ax[9].plot( dl, vl )

#### some diag
ddata = [ [ z, v ] for x, y, z, v in data if ( abs(x - y)<1e-3 and abs(x - z) < 1e-3 ) ]
vl = np.fromiter( ( f( d, d, d, *masses ) for d in dl ), float )
ax[7].plot( *zip(*sorted( ddata ) ), ls='', marker='o' )
ax[7].plot( dl, vl )

#### some other diag
ddata = [ [ z, v ] for x, y, z, v in data if ( abs(x - y)<1e-3 and abs(x + z) < 1e-3 ) ]
vl = np.fromiter( ( f( d, d, -d, *masses ) for d in dl ), float )
ax[8].plot( *zip(*sorted( ddata ) ), ls='', marker='o' )
ax[8].plot( dl, vl )

plt.show()

This gives the following output:

solution
[-1.46528595  0.25090717  0.25090717 -1.46528595  0.25090717 -1.46528595 -0.01993436]
masses:
[-0.6824606499739905, 3.985537743156507, 3.9855376943660676, -0.6824606473928339, 3.9855377322848344, -0.6824606467055248, -50.16463861555409]
covariance matrix:
[
[ 4.76417852e-03 -1.46907683e-12 -8.57639600e-12 -2.21281938e-12 -2.38444957e-12  8.42981521e-12 -2.70034183e-05]
[-1.46907683e-12  9.17104397e-04 -7.10573582e-13  1.32125214e-11  7.44553140e-12  1.29909935e-11 -1.11259046e-13]
[-8.57639600e-12 -7.10573582e-13  9.17104389e-04 -8.60004172e-12 -6.14797647e-12  8.27070243e-12  3.11127064e-14]
[-2.21281914e-12  1.32125214e-11 -8.60004172e-12  4.76417860e-03 -4.20477032e-12  9.20893224e-12 -2.70034186e-05]
[-2.38444957e-12  7.44553140e-12 -6.14797647e-12 -4.20477032e-12  9.17104395e-04  1.50963408e-11 -7.28889534e-14]
[ 8.42981530e-12  1.29909935e-11  8.27070243e-12  9.20893175e-12  1.50963408e-11  4.76417849e-03 -2.70034182e-05]
[-2.70034183e-05 -1.11259046e-13  3.11127064e-14 -2.70034186e-05 -7.28889534e-14 -2.70034182e-05  5.77019926e-07]
]
sum of residuals
4.352727352163743e-09

...and here are some 1D cuts that show significant deviation from parabolic behaviour once one is not on one of the main axes.

[figure: 1D cuts along the axes and diagonals]
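The error propagation back to the masses mentioned at the top is not carried out in the code; a minimal first-order sketch, using the printed solution and the diagonal of the scaled covariance matrix from above, could look like this:

import numpy as np

# inverse-mass solution and variances taken from the output above
solI = np.array([-1.46528595, 0.25090717, 0.25090717,
                 -1.46528595, 0.25090717, -1.46528595])
varI = np.array([4.764e-03, 9.171e-04, 9.171e-04,
                 4.764e-03, 9.171e-04, 4.764e-03])

# m = 1/mI, so |dm/dmI| = 1/mI**2 and the first-order error follows as
m = 1.0 / solI
m_err = np.sqrt(varI) / solI**2
print(m, m_err)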

mikuszefski
  • Thank you very much for your insightful answer. This gives much more reasonable results, but eigenvalues of the coefficient matrix are very often still positive, which they shouldn't be (for the data sets I have at least). The diagonal coefficients still seem to be fitted badly... – roman Jan 27 '20 at 14:32
  • @roman "fitted badly" and not getting what you expect are usually two different things. you may plot your data on the according cuts, e.g. where `y` and `z` are approximately zero. Then you can have a look at the `x` parabola. It also depends on how noisy your data is. – mikuszefski Jan 27 '20 at 16:09
  • I've quickly plotted the data along the axes and diagonals [here](https://www.dropbox.com/s/vs95zhlvsg15r0f/fit_results.png?dl=0). Black dots are the original data, red dots the fitted parabola curve. If I actually change the sign of the coefficients of the diagonal elements, I seem to get a closer fit. However, the eigenvalues of the coefficient matrix are still off. I'm not sure at this point if I've understood the problem of effective masses at all. – roman Jan 27 '20 at 17:25
  • Yes, of course - [this example](https://www.dropbox.com/s/vlyhw03wpgjf34r/silicon.csv?dl=0) is the data for silicon. The first three columns are x, y, z and the fourth column is the value – roman Jan 28 '20 at 13:26
  • @roman here with your data now. I'd say that the data shows already deviations from the simple second order. – mikuszefski Jan 29 '20 at 10:15
  • Thank you very much for taking so much of your time helping me. Now I know at least which problem I have to tackle. Maybe I can come up with a work-around by introducing some weights on the points far away from the maximum. – roman Jan 29 '20 at 19:49
  • @roman happy if it helped....btw...feel free to upvote or accept. – mikuszefski Jan 31 '20 at 09:33
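Regarding the eigenvalues discussed in the comments above: a minimal sketch of assembling the inverse-mass tensor from the fitted parameters and diagonalising it. The values are taken from the answer's output; the helper itself is illustrative:

import numpy as np

# fitted inverse-mass parameters from the answer's output (offset dropped)
mxxI, mxyI, mxzI, myyI, myzI, mzzI = (-1.46528595, 0.25090717, 0.25090717,
                                      -1.46528595, 0.25090717, -1.46528595)

# g() corresponds to E ~ (1/2) k^T W k with this symmetric tensor W
W = np.array([[mxxI, mxyI, mxzI],
              [mxyI, myyI, myzI],
              [mxzI, myzI, mzzI]])

eigvals = np.linalg.eigvalsh(W)
print(eigvals)        # all negative for a true maximum
print(1.0 / eigvals)  # principal effective masses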