Students are required to hand in the exercises and questions below in the form of a SWAN notebook or pdf file (with code snippets) by Wednesday, May 16. Please send the material to Anna Sfyrla (anna.sfyrla@cern.ch) and T.J. Khoo. (Teng.Jian.Khoo@cern.ch).
As you heard in the lecture, a serious attempt at hypothesis tests requires us to take into account a few more effects that can cause us to overestimate the statistical significance of a result:
Today you will see how to incorporate treatments of these effects into the statistical models that we built last time.
# Usual setup
import ROOT
from ROOT import RooFit
ROOT.gStyle.SetOptStat(0) # Disable stat box
ROOT.enableJSVis()
inputfile = ROOT.TFile('files/ttreedata.root')
inputtree = inputfile.Get('Sample1a')
Nevents = inputtree.GetEntries()
c = ROOT.TCanvas('plot','plot',800,600)
w = ROOT.RooWorkspace()
w.factory('x[0.,200.]')
x = w.var('x')
inputdata = ROOT.RooDataSet('data_1a','data_1a',inputtree,ROOT.RooArgSet(x))
We reproduce the single-bin cut & count analysis from the previous demonstration. The first step is to do the sideband fit to extract the background estimate. This time, however, we want to also record how the number of estimated background events in the SR varies depending on the uncertainties on the fitted parameters.
# Define sidebands & SR
SRmin = 35
SRmax = 50
x.setRange('signalregion',SRmin,SRmax)
x.setRange('sideband_low',0.,SRmin)
x.setRange('sideband_high',SRmax,200.)
# Fit background pdf to sideband data
w.factory('Exponential::bkg(x,tau[-0.1,-0.2,0.])')
w.factory('SUM::bkg_ext(bkgnorm[1.,0.,1000]*bkg)')
bkgpdf = w.pdf('bkg_ext')
sidebanddata = ROOT.RooDataSet('sideband_1a','sideband_1a',inputtree,ROOT.RooArgSet(x),'x<{0} || x>{1}'.format(SRmin,SRmax))
fitresult = bkgpdf.fitTo(sidebanddata,RooFit.PrintLevel(-1),
RooFit.Range('sideband_low,sideband_high'),
RooFit.Extended())
bkgnorm = w.var('bkgnorm')
bkgnorm_nom = bkgnorm.getVal()
bkgnorm_err = bkgnorm.getError()
print "Estimated total background events: {0:.2f} +/- {1:.2f}".format(bkgnorm_nom, bkgnorm_err)
tau = w.var('tau')
tau_nom = tau.getVal()
tau_err = tau.getError()
print "Background exponential display constant: {0:.2g} +/- {1:.2g}".format(tau_nom, tau_err)
bkgint_SR = bkgpdf.createIntegral(ROOT.RooArgSet(x),ROOT.RooArgSet(x),'signalregion')
Nbkg_vals = {"Nominal": bkgint_SR.getVal()*bkgnorm_nom,
"BkgNormUp": None,
"BkgNormDown": None,
"TauUp": None,
"TauDown": None
}
print "Nominal SR Nbkg: {0:.2f}".format(Nbkg_vals["Nominal"])
We typically refer to the output we get from carrying out steps of our analysis (e.g. background estimates, parameter estimation from MLE) without taking into account systematic uncertainties as the "nominal" result, and then determine the size of our systematic uncertainties by comparing the nominal value with "systematic variations".
In this case, we could evaluate the systematic uncertainties on the background estimate by computing Nbkg
where instead of using the nominal bkgnorm
and tau
parameter values, we add/subtract the fitted uncertainties. These are what we refer to as $\pm 1\sigma$ variations.
HINT: You don't need to recreate the integral of bkgpdf
for each variation. Try changing the value of tau
by calling tau.setVal(...)
and see how the return value of bkgint_SR.getVal()
changes.
# Up to you...
for var,val in sorted(Nbkg_vals.iteritems()):
percentdelta = 100.*(val/Nbkg_vals["Nominal"]-1)
print "For variation '{0:12}', SR Nbkg = {1:.2f} (delta = {2:.2f}%)".format(var,val,percentdelta)
diff_BkgNorm = max(Nbkg_vals["BkgNormUp"]-Nbkg_vals["Nominal"],Nbkg_vals["Nominal"]-Nbkg_vals["BkgNormDown"])
w.factory('prod::delta_BkgNorm(sigma_BkgNorm[{0}],theta_BkgNorm[0.,-5.,5.])'.format(diff_BkgNorm))
diff_Tau = max(Nbkg_vals["TauUp"]-Nbkg_vals["Nominal"],Nbkg_vals["Nominal"]-Nbkg_vals["TauDown"])
w.factory('prod::delta_Tau(sigma_Tau[{0}],theta_Tau[0.,-5.,5.])'.format(diff_Tau))
w.factory('sum::Nbkg_var(Nbkg[{0},0,200],delta_BkgNorm,delta_Tau)'.format(Nbkg_vals["Nominal"]))
w.factory('sum::Nexp(Nsig[0,-20.,100.],Nbkg_var)')
w.factory('Poisson::pois(Nobs[100,0,250],Nexp)')
Now, we carry on with the hypothesis test, but need to extend the PDF to incorporate nuisance parameters for the two uncertainty components we have identified.
To set up a convention, we label these nuisance parameters with "$\theta$", and define them such that a value of $\theta_\mathrm{syst} = \pm1$ corresponds to a $\pm1\sigma$ variation of Nbkg
by $\sigma_\mathrm{syst}$, where "syst" can be "Tau" or "BkgNorm". The uncertainties we extracted above are pretty symmetric, so we can be a little conservative and define the $\sigma$ values to be the larger difference between the nominal value of Nbkg
and the up/down variations.
Then, we define a poisson pdf like last time, but build the Nexp
value from an expression that incorporates the $\theta$ nuisance parameters:
$$ N_\mathrm{exp} = N_\mathrm{sig} + N_\mathrm{bkg} + \sum_\mathrm{syst} (\sigma_\mathrm{syst} * \theta_\mathrm{syst}), $$
where $N_\mathrm{bkg}$ of course refers to the nominal background estimate.
Instead of using absolute uncertainties, we could instead compute relative uncertainties on $N_\mathrm{bkg}$ and then apply them using the formula $$ N_\mathrm{exp} = N_\mathrm{sig} + N_\mathrm{bkg} \times \prod_\mathrm{syst} (1 +\sigma_\mathrm{syst}^\mathrm{rel} * \theta_\mathrm{syst}). $$
What remains missing from the model is to allow the nuisance parameters to vary. Recall from the lecture that the standard way to do so is to restrict each one to be normal-distributed (i.e. to follow a gaussian with mean 0 and width 1).
# Up to you...
Now we proceed as before, extracting the Nsig
estimate from the observed number of events.
inputtree.Draw('>>elist','x>{0} && x<{1}'.format(SRmin,SRmax))
elist = ROOT.gDirectory.Get('elist')
NSR = elist.GetN() # Note: this is just a plain float, which we'll use to integrate for the p-value
print '*** Number of total (observed) events in SR:', NSR
Nobs = w.var('Nobs')
Nobs.setVal(NSR)
Nbkg = w.var('Nbkg')
# We can do the following if Nobs > Nbkg
Nsig_cnc = Nobs.getVal() - Nbkg.getVal()
print 'A first estimate of the signal number of events: {0:.2f}'.format(Nsig_cnc)
As discussed in the lecture, we can now use the Profile Likelihood to carry out statistical tests, much as we did with the NLL previously. However, the nuisance parameters inject some elasticity into the likelihood, allowing it to absorb deviations from the model parameters more easily.
As before, we create the NLL, but then proceed to produce the PL from the NLL (or rather, the negative Profile Log Likelihood, abbreviated PLL). To do this, we have to declare the parameters of interest to RooFit, i.e. state which parameters are not nuisance parameters to be profiled.
Is the plot of the PLL compared to the NLL w/o systematics as you would expect it to be?
Nobs_bkg = ROOT.RooDataSet('Nobs_bkg', 'Nobs_bkg', ROOT.RooArgSet(Nobs))
Nobs_bkg.add(ROOT.RooArgSet(Nobs))
pois_with_constraint = w.pdf('pois_with_constraint')
nll_cnc = pois_with_constraint.createNLL(Nobs_bkg)
Nsig = w.var('Nsig')
Nobs.setVal(NSR)
Nsigframe = Nsig.frame()
# We declared the poisson+constraint PDF to have four parameters: Nsig, Nbkg, theta_BkgNorm, theta_Tau
# So declare the first two as POIs
pll_cnc = nll_cnc.createProfile(ROOT.RooArgSet(Nsig,Nbkg))
# We set the plotted min to 0, as we're only interested in differences wrt the min NLL
nll_cnc.plotOn(Nsigframe,RooFit.ShiftToZero())
pll_cnc.plotOn(Nsigframe,RooFit.LineColor(ROOT.kRed),RooFit.ShiftToZero())
Nsigframe.SetMaximum(10)
Nsigframe.SetMinimum(-1)
Nsigframe.Draw()
c.Draw()
As before, we can extract the p-values for the models with and without systematics. Now, we rely on the stronger version of Wilks's theorem, stating that the $\chi^2$ approximation is still valid in the case of the profile likelihood.
Note: Because evaluating the PLL requires a scan over the NPs, we need to reset the thetas to 0 if we want to evaluate the plain poisson NLL safely.
w.var('theta_Tau').setVal(0)
w.var('theta_BkgNorm').setVal(0)
Nsig.setVal(0)
nll_0sig = nll_cnc.getVal()
pll_0sig = pll_cnc.getVal()
w.var('theta_Tau').setVal(0)
w.var('theta_BkgNorm').setVal(0)
Nsig.setVal(Nsig_cnc)
nll_obs = nll_cnc.getVal()
pll_obs = pll_cnc.getVal()
# We divide by two because we assigned 50% of the probability to the case
# where the observation is no greater than the background prediction,
# and the chi2 probability is symmetric.
p0_nosyst = ROOT.TMath.Prob( 2*(nll_0sig-nll_obs), 1)/2
z_nosyst = ROOT.RooStats.PValueToSignificance(p0_nosyst)
print "P-value based on DeltaNLL = {0:.3g} ==> {1:.3g} sigma".format(p0_nosyst,z_nosyst)
p0_syst = ROOT.TMath.Prob( 2*(pll_0sig-pll_obs), 1)/2
z_syst = ROOT.RooStats.PValueToSignificance(p0_syst)
print "P-value based on DeltaPLL = {0:.3g} ==> {1:.3g} sigma".format(p0_syst,z_syst)
The uncertainties we determined from the background fit were pretty small, and as a result don't have a very significant impact on the p-value.
Here is the basic poisson product PDF for a three-bin signal region. It is of course up to you to choose where you would like to place the bin boundaries. Feel free to adjust variable ranges depending on which bin boundaries you choose. You could even consider automating the setup such that you could define an arbitrary set of bins.
w.factory('sum::Nexp_1(Nbkg_1[50,0,100],Nsig_1[0,0,100])')
w.factory('sum::Nexp_2(Nbkg_2[50,0,100],Nsig_2[0,0,100])')
w.factory('sum::Nexp_3(Nbkg_3[50,0,100],Nsig_3[0,0,100])')
w.factory('Poisson::pois_bin1(Nobs_1[50,0,100],Nexp_1)')
w.factory('Poisson::pois_bin2(Nobs_2[50,0,100],Nexp_2)')
w.factory('Poisson::pois_bin3(Nobs_3[50,0,100],Nexp_3)')
w.factory('PROD::pois_multibin(pois_bin1,pois_bin2,pois_bin3)')
Then, as before, we extract the background estimates and bin counts for each of the signal region bins.
binrange = [(35,40),(40,45),(45,50)]
Nsig_obs_bins = [0.,0.,0.]
for ibin in range(len(binrange)):
binxlow,binxhigh = binrange[ibin]
x.setRange('SR_bin{0}'.format(ibin),binxlow,binxhigh)
bkgint = bkgpdf.createIntegral(ROOT.RooArgSet(x),ROOT.RooArgSet(x),'SR_bin{0}'.format(ibin))
bkgest = bkgint.getVal()*bkgnorm_nom
inputtree.Draw('>>elist','x>{0} && x<{1}'.format(binxlow,binxhigh))
elist = ROOT.gDirectory.Get('elist')
bincount = elist.GetN()
w.var('Nbkg_{0}'.format(ibin+1)).setVal(bkgest)
w.var('Nobs_{0}'.format(ibin+1)).setVal(bincount)
Nsig_obs_bins[ibin] = bincount - bkgest
Nobs_multibin = ROOT.RooDataSet('Nobs_multibin', 'Nobs_multibin', ROOT.RooArgSet(w.var('Nobs_1'),w.var('Nobs_2'),w.var('Nobs_3')))
Nobs_multibin.add(ROOT.RooArgSet(w.var('Nobs_1'),w.var('Nobs_2'),w.var('Nobs_3')))
nll_multibin = w.pdf('pois_multibin').createNLL(Nobs_multibin)
We can compute the deltaNLL and thereby the p-value and significance as before.
for ibin in range(3):
w.var('Nsig_{0}'.format(ibin+1)).setVal(0)
nll_multibin_0sig = nll_multibin.getVal()
for ibin in range(3):
w.var('Nsig_{0}'.format(ibin+1)).setVal(Nsig_obs_bins[ibin])
nll_multibin_obs = nll_multibin.getVal()
dnll_multibin = nll_multibin_0sig - nll_multibin_obs
p0_multibin = ROOT.TMath.Prob( 2*(dnll_multibin), 1)/2
z_multibin = ROOT.RooStats.PValueToSignificance(p0_multibin)
print 'Discovery p-value for {0}-bin counting experiment = {1:.2g} (Z = {2:.2f})'.format(len(binrange),p0_multibin,z_multibin)
Hint: Don't forget that you need to create new variable and pdf names for the cases with the nuisance parameters. It might be helpful to experiment in a standalone python script, or use another notebook where you can keep the code together. If you need to overwrite existing objects in the workspace, either restart the kernel, or call .Delete()
on the object first.
We want to return to the bump-hunt (gaussian + exponential fit) we did previously and compare the local and global significance of the observed bump in Sample1a. We will do this in several steps.
First, we set up the signal+background model for the "bump-hunt" analysis that we performed previously, but restrict ourselves to a scan only in the mass of the signal resonance, fixing the width to be a fraction of the mass.
w.factory('prod::sigma(widthfrac[0.1],mu[40,0,200])')
w.factory('Gaussian::sig(x,mu,sigma)')
w.factory('SUM::splusb(Nsig*sig,bkgnorm*bkg)')
sbpdf = w.pdf('splusb')
fitresult = sbpdf.fitTo(inputdata,ROOT.RooFit.Save(),ROOT.RooFit.PrintLevel(-1))
fitresult.Print()
Nsig_fit = w.var('Nsig').getVal()
xframe = x.frame()
inputdata.plotOn(xframe)
sbpdf.plotOn(xframe)
sbpdf.plotOn(xframe,RooFit.LineStyle(ROOT.kDashed),RooFit.Components('bkg'))
sbpdf.plotOn(xframe,RooFit.LineColor(ROOT.kRed),RooFit.Components('sig'))
xframe.Draw()
c.Draw()
As usual, we extract the p-value and significance for the fitted signal. Remember, this is the local p-value for a single value of the mass parameter (i.e. the MLE).
nll_fit = sbpdf.createNLL(inputdata)
Nsig.setVal(0)
nll_0sig = nll_fit.getVal()
Nsig.setVal(Nsig_fit)
nll_obs = nll_fit.getVal()
p0_fit = ROOT.TMath.Prob( 2*(nll_0sig-nll_obs), 1)/2
z_fit = ROOT.RooStats.PValueToSignificance(p0_fit)
print "P-value based on DeltaNLL = {0:.3g} ==> {1:.3g} sigma".format(p0_fit,z_fit)
We can scan the mass parameter over the full range and make the local p-value plot, to illustrate how many other significant bumps we might have found satisfying the constraints we placed on our signal. To do so, we need to make mu
a constant, so it is not allowed to vary in the fit. We also take the absolute delta NLL, mainly because a negative value cannot be evaluated for a $\chi^2$ probability.
Note: If you are keen-eyed, you might have noticed that we let the signal range go a little bit negative. For a few mu values, this leads to the PLL printing errors because the PDF is undefined, but RooFit can basically cope.
mu = w.var('mu')
mu.setConstant(True)
from numpy import arange
graph_plocal = ROOT.TGraph(200)
graph_zlocal = ROOT.TGraph(200)
for idx,mu_hypo in enumerate(arange(1.,201.,1.)):
mu.setVal(mu_hypo)
fitresult = sbpdf.fitTo(inputdata,ROOT.RooFit.Save(),ROOT.RooFit.PrintLevel(-1))
Nsig_fit = w.var('Nsig').getVal()
Nsig.setVal(0)
nll_0sig = nll_fit.getVal()
Nsig.setVal(Nsig_fit)
nll_obs = nll_fit.getVal()
dnll = abs(nll_0sig-nll_obs)
p0_local = ROOT.TMath.Prob( 2*dnll, 1)/2
z_local = ROOT.RooStats.PValueToSignificance(p0_local)
graph_plocal.SetPoint(idx,mu_hypo,p0_local)
graph_zlocal.SetPoint(idx,mu_hypo,z_local)
Now we draw the distribution of the local Z significance (feel free to draw the p-value as well if you find that more intuitive). The line at a chosen $Z_\mathrm{test}$ is marked.
z_test = 1
zframe = ROOT.TH2F("zframe","zframe",1,0,200,1,0,7)
zframe.Draw()
graph_zlocal.SetLineWidth(2)
graph_zlocal.SetLineColor(ROOT.kBlue)
graph_zlocal.Draw("c")
Ztestline = ROOT.TLine(0,z_test,200,z_test)
Ztestline.SetLineWidth(2)
Ztestline.SetLineStyle(ROOT.kDashed)
Ztestline.SetLineColor(ROOT.kRed)
Ztestline.Draw()
c.Draw()
For the asymptotic trial factor estimates, we want to use: $$ N_\mathrm{trials} = 1 + \frac{1}{p_\mathrm{local}} \left\langle N_\mathrm{up} (Z_\mathrm{test}) \right\rangle \exp\left( \frac{Z_\mathrm{test}^2-Z_\mathrm{local}^2}{2} \right) $$ where $N_\mathrm{up}$ is the number of "up-crossings", i.e. the number of times $Z_\mathrm{local} > Z_\mathrm{test}$ (i.e. we count the number of excesses that exceed some test significance, and then use this to calibrate the trial factor.
isAboveZtest = False
Nup = 0
# We only want to count the crossings, so need to test for when the Zlocal increases past Ztest
for ipoint in xrange(graph_zlocal.GetN()):
z_local = graph_zlocal.GetY()[ipoint]
if z_local > z_test:
if not isAboveZtest:
Nup += 1
isAboveZtest = True
else:
isAboveZtest = False
print "Number of upcrossings for Ztest={0:.1f} is {1}".format(z_test,Nup)
import math
p0_global = p0_fit + Nup * math.exp( 0.5 * (z_test*z_test - z_fit*z_fit) )
z_global = ROOT.RooStats.PValueToSignificance(p0_global)
Ntrials = p0_global / p0_fit
print "Trials factor for local p0 = {0:.2g} (Z = {1:.2f}) is {2}".format(p0_fit,z_fit,Ntrials)
print "Global p0 = {0:.2g} (Z = {1:.2f})".format(p0_global,z_global)
Not a very impressive result after all! Try changing z_test
, and see if the result changes. Also, switch to the high statistics data sample and repeat the process. How much does a factor of 10 increase in statistics improve the global p0? Does the LEE matter more for low- or high-significance results?
Hints: Remember, this is the file called "ttreedata_highstats.root", not "_hs.root". You may need to change some of the allowable ranges for a few fit parameters