Skip to main content

Confidence Intervals in model fitting

Confidence Intervals

In this project, I will show how to retrieve confidence intervals from given errors on best-fit parameters. I will be using the same limb model adopted in the rest of the projects, in addition to the error retrieval methods explained in error_estimation

In [1]:
## importing the libraries
import numpy as np
import math as m
import scipy
import matplotlib.pyplot as plt
from scipy import special
from scipy.optimize import curve_fit
In [2]:
def Erfc(x,sigma):

 y = special.erfc(x/(sigma*np.sqrt(2)))
 return y

## Introducing the model to be used later for the fitting

def SL_fit(x,w1,w2,w3,s1,s2,s3):

  f = 0.5*(w1*Erfc(x,s1)+w2*Erfc(x,s2)+w3*Erfc(x,s3)+ (1-w1-w2-w3))
  return f
In [3]:
## Importing data
path = "/home/fatima/Desktop/project_3/"
file = np.loadtxt(path+'limb_profile_av_norm_shifted')
x = file[:,0]
y = file[:,1]
ind = np.where(x>=0)
x = x[ind]
y = y[ind]
weights = np.sqrt(np.abs(y))
In [4]:
p0=[0.3, 0.3, 0.2, 1, 2, 3] ## initial guess best-fit parameters
popt, pcov = curve_fit(SL_fit,x,y,p0,method='lm',sigma=weights,ftol=1e-8,xtol=1e-8,maxfev=5000)
chi_sq_w = np.sum((1/weights**2)*(SL_fit(x,*popt)-y)**2)
red_chi_sq = chi_sq_w/(len(y)-len(popt))
print popt # to print the best-fit parameters
[ 0.52750103  0.28882568  0.10191755  0.25905336  0.76540583  2.83343007]

To compute the errors on the best-fit parmeters:

In [5]:
perr = np.sqrt(np.diag(pcov))
print perr 
[ 0.00349415  0.0028457   0.00119396  0.00135734  0.00715294  0.04738134]

The errors returned by curve_fit is the standard error with 67% confidence interval. If we want to derive for example the 95% CI, we should look up for the Z-table. However, this is only valid if the parameters follow a normal distribution, which is not the case for small sample size. This is where the t-distribution comes in handy.

In [6]:
from scipy.stats.distributions import  t
In [8]:
alpha = 0.05 # 95% confidence interval
N = len(y)
P = len(popt)
dof = max(0,N-P) ## dof is the degrees of freedom
tval = t.ppf(1.0 - alpha / 2.0, dof) 
for i, p,var in zip(range(N), popt, np.diag(pcov)):
    Sigma = var**0.5
    print 'c{0}: {1} [{2}  {3}]'.format(i, p,p - Sigma*tval,p + Sigma*tval)
c0: 0.527501034975 [0.520622132349  0.534379937602]
c1: 0.288825684997 [0.283223380201  0.294427989793]
c2: 0.101917547745 [0.0995670000292  0.104268095461]
c3: 0.259053363897 [0.256381173639  0.261725554154]
c4: 0.765405833574 [0.751323900146  0.779487767002]
c5: 2.83343007291 [2.74015081783  2.92670932799]

Voila!