How to perform a stacked analysis?

A stacked analysis is a binned analysis where all data from multiple observations are stacked into a single counts cube.

As usual, we start with the Python imports:

[1]:
import gammalib
import ctools
import cscripts

The data and the observation definition XML file obs.xml used here are those produced in the previous tutorial, "How to combine observations?".

Stacking Events

Events are stacked with the ctbin tool. Instead of passing an event list to ctbin, specify the observation definition XML file as input. ctbin will then loop over all observations and collect their events into a single counts cube.

[2]:
evbin = ctools.ctbin()
evbin['inobs']    = 'obs.xml'
evbin['xref']     = 83.63  # deg; we center the cube at the position of the Crab nebula
evbin['yref']     = 22.01  # deg
evbin['proj']     = 'CAR'
evbin['coordsys'] = 'CEL'
evbin['binsz']    = 0.02   # deg/bin
evbin['nxpix']    = 200
evbin['nypix']    = 200
evbin['ebinalg']  = 'LOG'
evbin['emin']     = 0.1    # TeV
evbin['emax']     = 100.0  # TeV
evbin['enumbins'] = 20
evbin['outobs']   = 'cntcube.fits'
evbin.execute()
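
The cube geometry requested in the cell above can be checked by hand. The following is a minimal sketch in plain Python (no ctools needed) that derives the angular extent of the cube and the logarithmic energy bin edges from the parameter values used above. The helper name log_energy_edges is made up for this illustration and is not part of ctools.

```python
import math

# Values copied from the ctbin cell above
binsz = 0.02      # deg per pixel
nxpix = 200
nypix = 200
emin = 0.1        # TeV
emax = 100.0      # TeV
enumbins = 20

# Angular extent of the cube along each axis (pixels times pixel size)
width_x = nxpix * binsz
width_y = nypix * binsz


def log_energy_edges(emin, emax, nbins):
    """Return nbins+1 logarithmically spaced bin edges between emin and emax."""
    log_min = math.log10(emin)
    step = (math.log10(emax) - log_min) / nbins
    return [10.0 ** (log_min + i * step) for i in range(nbins + 1)]


edges = log_energy_edges(emin, emax, enumbins)
bins_per_decade = enumbins / math.log10(emax / emin)

print("Cube extent: %.1f x %.1f deg" % (width_x, width_y))
print("Energy bins per decade: %.2f" % bins_per_decade)
print("First three bin edges (TeV):", [round(e, 4) for e in edges[:3]])
```

With these settings the cube covers 4° x 4° around the Crab position, and the three decades in energy are split into 20 bins.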

Computing the stacked response and background

You now have a stacked counts cube cntcube.fits on disk. Before you can use that counts cube in a maximum likelihood analysis, you need to compute the stacked instrument response function and the background model that is needed for the analysis.

For the instrument response function, you have to compute the total exposure for the stacked cube (i.e. the sum of the effective areas of the observations, each multiplied by the corresponding livetime) and an effective point spread function (i.e. the point spread functions of the different observations weighted by the corresponding exposures). Optionally, you can also compute an effective energy dispersion (i.e. the energy dispersions of the different observations weighted by the corresponding exposures). To obtain this information, use the ctexpcube, ctpsfcube and ctedispcube tools:

[3]:
expcube = ctools.ctexpcube()
expcube['inobs']   = 'obs.xml'
expcube['caldb']   = 'prod2'
expcube['irf']     = 'South_0.5h'
expcube['incube']  = 'cntcube.fits'  # exposure cube definition is copied from counts cube
expcube['outcube'] = 'expcube.fits'
expcube.execute()
[4]:
psfcube = ctools.ctpsfcube()
psfcube['inobs']    = 'obs.xml'
psfcube['caldb']    = 'prod2'
psfcube['irf']      = 'South_0.5h'
psfcube['incube']   = 'NONE'
psfcube['xref']     = 83.63
psfcube['yref']     = 22.01
psfcube['proj']     = 'CAR'
psfcube['coordsys'] = 'CEL'
psfcube['binsz']    = 1.0   # deg/bin; the PSF only varies slowly
psfcube['nxpix']    = 10
psfcube['nypix']    = 10
psfcube['emin']     = 0.1
psfcube['emax']     = 100.0
psfcube['enumbins'] = 20
psfcube['outcube']  = 'psfcube.fits'
psfcube.execute()
[5]:
edispcube = ctools.ctedispcube()
edispcube['inobs']    = 'obs.xml'
edispcube['caldb']    = 'prod2'
edispcube['irf']      = 'South_0.5h'
edispcube['incube']   = 'NONE'
edispcube['xref']     = 83.63
edispcube['yref']     = 22.01
edispcube['proj']     = 'CAR'
edispcube['coordsys'] = 'CEL'
edispcube['binsz']    = 1.0   # deg/bin; the energy dispersion only varies slowly
edispcube['nxpix']    = 10
edispcube['nypix']    = 10
edispcube['emin']     = 0.1
edispcube['emax']     = 100.0
edispcube['enumbins'] = 20
edispcube['outcube']  = 'edispcube.fits'
edispcube.execute()
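
The definitions of the stacked exposure and the effective point spread function given before these cells can be illustrated with a toy calculation. The sketch below is not how ctools is implemented. It assumes three hypothetical observations, evaluates everything at a single sky direction and a single energy, and models each PSF as a 2D Gaussian. All numbers and names are invented for illustration.

```python
import math

# Hypothetical inputs for three observations (one sky direction, one energy)
observations = [
    {"aeff_cm2": 2.0e9, "livetime_s": 1800.0, "psf_sigma_deg": 0.05},
    {"aeff_cm2": 1.5e9, "livetime_s": 1800.0, "psf_sigma_deg": 0.06},
    {"aeff_cm2": 1.0e9, "livetime_s": 900.0, "psf_sigma_deg": 0.08},
]


def gaussian_psf(delta_deg, sigma_deg):
    """Value of a normalised 2D Gaussian PSF (per deg2) at offset delta_deg."""
    norm = 2.0 * math.pi * sigma_deg ** 2
    return math.exp(-0.5 * (delta_deg / sigma_deg) ** 2) / norm


# Exposure of each observation: effective area times livetime (cm2 s)
exposures = [o["aeff_cm2"] * o["livetime_s"] for o in observations]

# Stacked exposure: sum over all observations
total_exposure = sum(exposures)


def stacked_psf(delta_deg):
    """Exposure-weighted average of the individual PSFs at offset delta_deg."""
    weighted = sum(x * gaussian_psf(delta_deg, o["psf_sigma_deg"])
                   for x, o in zip(exposures, observations))
    return weighted / total_exposure


print("Stacked exposure: %.3e cm2 s" % total_exposure)
print("Stacked PSF at 0.05 deg: %.2f per deg2" % stacked_psf(0.05))
```

Observations with a larger exposure contribute more to the effective PSF, which is why the weighting is needed when observations with different pointing offsets or durations are combined.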

You may have noticed that for ctexpcube you provided an input counts cube, while for the other tools you specified NONE. By providing an input counts cube you instructed ctexpcube to extract the cube geometry from the counts cube, which is a convenient way to reduce the number of user parameters that you need to specify. You did not, however, apply this trick to ctpsfcube and ctedispcube. The point spread function and energy dispersion do not vary significantly on spatial scales of 0.02°, and using the counts cube definition for these cubes would produce large response cube files with a spatial precision that is not needed (the point spread function and energy dispersion cubes are 4-dimensional data cubes, so their size grows quickly with the number of spatial pixels). You therefore specified a coarser image scale of 1° for both cubes and only 10x10 spatial pixels, which leads to point spread function and energy dispersion cubes of modest size (a few MB).
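
To see why the spatial binning matters for a 4-dimensional cube, the rough estimate below compares the two geometries. It is only a back-of-the-envelope sketch: the number of bins along the fourth axis (200 here) and the 4-byte storage per value are assumptions made for illustration, and FITS headers and additional extensions are ignored.

```python
def cube_size_mb(nxpix, nypix, n_energies, n_fourth_axis, bytes_per_value=4):
    """Approximate size in MB of a 4-dimensional cube of numeric values."""
    n_values = nxpix * nypix * n_energies * n_fourth_axis
    return n_values * bytes_per_value / 1.0e6


# Assumed number of bins along the fourth axis (not taken from the tutorial)
N_FOURTH_AXIS = 200

coarse = cube_size_mb(10, 10, 20, N_FOURTH_AXIS)    # 10x10 pixels of 1 deg
fine = cube_size_mb(200, 200, 20, N_FOURTH_AXIS)    # counts cube geometry

print("Coarse cube: %.1f MB" % coarse)
print("Fine cube:   %.1f MB" % fine)
print("Ratio:       %.0f" % (fine / coarse))
```

Whatever the exact length of the fourth axis, the ratio between the two geometries is set by the number of spatial pixels alone: 200x200 pixels need 400 times more storage than 10x10 pixels.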

You provided the obs.xml file that defines all observations as input so that the tools know which observations were combined in the ctbin run. As the final step of the analysis preparation, you need to generate a background cube using the ctbkgcube tool.

[6]:
bkgcube = ctools.ctbkgcube()
bkgcube['inobs']    = 'obs.xml'
bkgcube['caldb']    = 'prod2'
bkgcube['irf']      = 'South_0.5h'
bkgcube['incube']   = 'cntcube.fits'
bkgcube['inmodel']  = '$CTOOLS/share/models/crab.xml'
bkgcube['outcube']  = 'bkgcube.fits'
bkgcube['outmodel'] = 'model.xml'
bkgcube.execute()

The usage of ctbkgcube is very similar to that of ctexpcube, but it takes the model definition XML file as an additional input parameter. Here you used the usual $CTOOLS/share/models/crab.xml model file that is shipped with ctools. ctbkgcube writes the background cube file bkgcube.fits and the model definition XML file model.xml, which can be used for further analysis. A look at the model.xml file illustrates how the background modelling works:

[7]:
print(gammalib.GXml('model.xml'))
=== GXml ===
GXmlDocument::version="1.0" encoding="UTF-8" standalone="no"
GXmlElement::source_library title="source library"
  GXmlElement::source name="Crab" type="PointSource"
    GXmlElement::spectrum type="PowerLaw"
      GXmlElement::parameter name="Prefactor" value="5.7" error="0" scale="1e-16" min="1e-07" max="1000" free="1"
      GXmlElement::parameter name="Index" value="2.48" error="0" scale="-1" min="0" max="5" free="1"
      GXmlElement::parameter name="PivotEnergy" value="0.3" scale="1000000" min="0.01" max="1000" free="0"
    GXmlElement::spatialModel type="PointSource"
      GXmlElement::parameter name="RA" value="83.6331" scale="1" min="-360" max="360" free="0"
      GXmlElement::parameter name="DEC" value="22.0145" scale="1" min="-90" max="90" free="0"
  GXmlElement::source name="BackgroundModel" type="CTACubeBackground" instrument="CTA,HESS,MAGIC,VERITAS"
    GXmlElement::spectrum type="PowerLaw"
      GXmlElement::parameter name="Prefactor" value="1" error="0" scale="1" min="0.01" max="100" free="1"
      GXmlElement::parameter name="Index" value="0" error="0" scale="1" min="-5" max="5" free="1"
      GXmlElement::parameter name="PivotEnergy" value="1" scale="1000000" free="0"

The Crab source component is the same as the one in $CTOOLS/share/models/crab.xml and is not modified. The background component, however, has been replaced by a model of type CTACubeBackground. This model is a 3-dimensional data cube that describes the expected background rate as a function of spatial position and energy. The data cube is multiplied by a power law spectrum, which allows the normalization and slope of the background spectrum to be adjusted in the fit. This power law could be replaced by any spectral model that is an appropriate multiplier for the background cube.
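
The effect of the power law on the background cube can be sketched in a few lines of plain Python. The XML snippet below is retyped from the listing above and is an assumption about how the spectrum appears in the file itself. The helper names true_values and background_multiplier are made up for this illustration. Each parameter's true value is the product of its value and scale attributes, so the pivot energy of 1 with scale 1000000 corresponds to 1 TeV expressed in MeV.

```python
import xml.etree.ElementTree as ET

# Spectral part of the background model, retyped from the listing above
SPECTRUM_XML = """
<spectrum type="PowerLaw">
  <parameter name="Prefactor" value="1" error="0" scale="1" min="0.01" max="100" free="1"/>
  <parameter name="Index" value="0" error="0" scale="1" min="-5" max="5" free="1"/>
  <parameter name="PivotEnergy" value="1" scale="1000000" free="0"/>
</spectrum>
"""


def true_values(xml_text):
    """Return {name: value * scale} for every parameter of the spectrum."""
    root = ET.fromstring(xml_text)
    return {p.get("name"): float(p.get("value")) * float(p.get("scale"))
            for p in root.findall("parameter")}


def background_multiplier(energy_mev, pars):
    """Power law factor applied to the background cube at a given energy."""
    ratio = energy_mev / pars["PivotEnergy"]
    return pars["Prefactor"] * ratio ** pars["Index"]


pars = true_values(SPECTRUM_XML)
for energy_tev in (0.1, 1.0, 100.0):
    factor = background_multiplier(energy_tev * 1.0e6, pars)
    print("E = %6.1f TeV  multiplier = %.3f" % (energy_tev, factor))
```

With the initial values (prefactor 1, index 0) the multiplier is 1 at every energy, so the background cube is used unchanged. The fit then adjusts the prefactor and index to rescale and tilt the predicted background.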

The exposure cube, PSF cube, energy dispersion cube, background cube and counts cube are not required to share the same spatial or energy binning. ctools interpolates all response cubes internally, so any appropriate binning may be used. Using the same binning for the exposure cube, the background cube and the counts cube is only a convenience.
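
The idea that a coarsely binned response cube can be evaluated at any energy can be illustrated with a toy interpolation. The sketch below is not the scheme used by ctools, which is not described in this tutorial. It simply interpolates linearly in log10(energy) between hypothetical exposure values tabulated at four energy nodes.

```python
import math
from bisect import bisect_right


def interpolate_log_energy(energies, values, energy):
    """Linearly interpolate tabulated values in log10(energy).

    The energies must be in ascending order. Outside the tabulated range
    the outermost interval is extended linearly.
    """
    logs = [math.log10(e) for e in energies]
    x = math.log10(energy)
    # Index of the lower node of the interval, clamped to a valid interval
    i = min(max(bisect_right(logs, x) - 1, 0), len(logs) - 2)
    frac = (x - logs[i]) / (logs[i + 1] - logs[i])
    return values[i] + frac * (values[i + 1] - values[i])


# Hypothetical exposure values (cm2 s) at four energy nodes (TeV)
nodes_tev = [0.1, 1.0, 10.0, 100.0]
exposure = [1.0e10, 4.0e10, 6.0e10, 5.0e10]

for energy_tev in (0.3, 1.0, 3.0, 30.0):
    value = interpolate_log_energy(nodes_tev, exposure, energy_tev)
    print("E = %5.1f TeV  exposure = %.3e cm2 s" % (energy_tev, value))
```

Because the response is evaluated by interpolation, the counts cube can use finer bins than the response cubes, provided the response varies slowly between nodes.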

Likelihood fitting

Now you have all files at hand to perform a stacked maximum likelihood analysis using the ctlike tool:

[8]:
like = ctools.ctlike()
like['inobs']    = 'cntcube.fits'
like['expcube']  = 'expcube.fits'
like['psfcube']  = 'psfcube.fits'
like['bkgcube']  = 'bkgcube.fits'
like['inmodel']  = 'model.xml'
like['outmodel'] = 'crab_results.xml'
like.execute()

ctlike takes the counts cube as the input observation and therefore also needs the file names of the exposure cube, the PSF cube and the background cube.

The results of the ctlike run are shown below.

[9]:
print(like.opt())
=== GOptimizerLM ===
 Optimized function value ..: 87667.270
 Absolute precision ........: 0.005
 Acceptable value decrease .: 2
 Optimization status .......: converged
 Number of parameters ......: 10
 Number of free parameters .: 4
 Number of iterations ......: 2
 Lambda ....................: 1e-05
[10]:
print(like.obs().models())
=== GModels ===
 Number of models ..........: 2
 Number of parameters ......: 10
=== GModelSky ===
 Name ......................: Crab
 Instruments ...............: all
 Observation identifiers ...: all
 Model type ................: PointSource
 Model components ..........: "PointSource" * "PowerLaw" * "Constant"
 Number of parameters ......: 6
 Number of spatial par's ...: 2
  RA .......................: 83.6331 [-360,360] deg (fixed,scale=1)
  DEC ......................: 22.0145 [-90,90] deg (fixed,scale=1)
 Number of spectral par's ..: 3
  Prefactor ................: 5.7351470584048e-16 +/- 7.15792803493375e-18 [1e-23,1e-13] ph/cm2/s/MeV (free,scale=1e-16,gradient)
  Index ....................: -2.45746587500157 +/- 0.010743485688059 [-0,-5]  (free,scale=-1,gradient)
  PivotEnergy ..............: 300000 [10000,1000000000] MeV (fixed,scale=1000000,gradient)
 Number of temporal par's ..: 1
  Normalization ............: 1 (relative value) (fixed,scale=1,gradient)
 Number of scale par's .....: 0
=== GCTAModelCubeBackground ===
 Name ......................: BackgroundModel
 Instruments ...............: CTA, HESS, MAGIC, VERITAS
 Observation identifiers ...: all
 Model type ................: "PowerLaw" * "Constant"
 Number of parameters ......: 4
 Number of spectral par's ..: 3
  Prefactor ................: 1.02157245578124 +/- 0.0113974414365358 [0.01,100] ph/cm2/s/MeV (free,scale=1,gradient)
  Index ....................: 0.00968334444956244 +/- 0.00675463165472791 [-5,5]  (free,scale=1,gradient)
  PivotEnergy ..............: 1000000 MeV (fixed,scale=1000000,gradient)
 Number of temporal par's ..: 1
  Normalization ............: 1 (relative value) (fixed,scale=1,gradient)
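
The fitted values printed above can be turned into more familiar quantities with a few lines of plain Python. The sketch below copies the numbers from the output by hand (it does not read them from the ctlike object) and evaluates the Crab power law at 1 TeV, the relative uncertainty of the prefactor, and how far the fitted background index lies from zero in units of its error. The function name crab_flux is made up for this illustration.

```python
# Fitted Crab parameters copied from the ctlike output above
prefactor = 5.7351470584048e-16       # ph/cm2/s/MeV at the pivot energy
prefactor_err = 7.15792803493375e-18  # ph/cm2/s/MeV
index = -2.45746587500157
pivot_mev = 300000.0                  # MeV (0.3 TeV)

# Fitted background index and its error, also copied from the output
bkg_index = 0.00968334444956244
bkg_index_err = 0.00675463165472791


def crab_flux(energy_mev):
    """Differential flux (ph/cm2/s/MeV) of the fitted power law."""
    return prefactor * (energy_mev / pivot_mev) ** index


flux_1tev_mev = crab_flux(1.0e6)          # ph/cm2/s/MeV at 1 TeV
flux_1tev_tev = flux_1tev_mev * 1.0e6     # converted to ph/cm2/s/TeV
rel_err = prefactor_err / prefactor
bkg_index_sigma = bkg_index / bkg_index_err

print("Crab flux at 1 TeV: %.3e ph/cm2/s/TeV" % flux_1tev_tev)
print("Relative prefactor error: %.2f %%" % (100.0 * rel_err))
print("Background index offset: %.2f sigma" % bkg_index_sigma)
```

The differential flux at 1 TeV comes out close to 3e-11 ph/cm2/s/TeV, and the background index is within about 1.5 standard errors of zero, so the fit barely tilts the background cube.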

If you also want to take the energy dispersion into account during the maximum likelihood fitting, set the edisp parameter to True and pass the energy dispersion cube to the edispcube parameter:

[11]:
like = ctools.ctlike()
like['inobs']     = 'cntcube.fits'
like['expcube']   = 'expcube.fits'
like['psfcube']   = 'psfcube.fits'
like['bkgcube']   = 'bkgcube.fits'
like['edisp']     = True                 # Set to True (is False by default)
like['edispcube'] = 'edispcube.fits'     # Provide energy dispersion cube
like['inmodel']   = 'model.xml'
like['outmodel']  = 'crab_results.xml'
# like.execute()                         # Uncomment if you have some time

The maximum likelihood computation including energy dispersion is more time consuming, and in many situations the impact of the energy dispersion on the analysis results will be very small. So make sure that you really need energy dispersion before you use it. Uncomment the execute() call in the previous cell to perform the likelihood analysis including energy dispersion.
