A response-matrix-centred approach to presenting cross-section measurements

Key takeaway

Publishing the detector response matrix alongside the raw data makes cross-section measurements easier to compare against arbitrary models of particle interactions, which helps improve the accuracy of particle physics experiments and simulations.

Quick Explainer

The core idea is to publish a detector response matrix alongside the raw cross-section measurement data, rather than only the unfolded distributions. Users can then forward-fold their own model predictions through the response matrix and compare the result with the raw data, without needing access to the full detector simulation. The response matrix is made as model-independent as possible by carefully choosing the truth-level binning. Systematic uncertainties in the detector response are handled by providing a set of toy response matrices, over which the likelihood is marginalized. This approach combines the model-independence of multi-dimensional unfolding with the ability to work with low-statistics, coarsely binned data, and it provides a convenient way to include background events.

Deep Dive

Technical Deep Dive: A Response-Matrix-Centred Approach to Presenting Cross-Section Measurements

Overview

This work presents a response-matrix-centred approach for publishing cross-section measurement data in a more model-independent way. The key ideas are:

  • There is a linear relationship between true physics expectation values (at the generator level) and expected number of measured events, described by a detector response matrix.
  • The response matrix and raw data are published as the primary outputs, rather than unfolded distributions.
  • This allows users to test arbitrary models against the data by forward-folding the model predictions through the response matrix.
  • The response matrix is designed to be as model-independent as possible by carefully choosing the truth-level binning.
  • Systematic uncertainties in the detector response are handled by providing a set of toy response matrices that are marginalized over.
  • Background events can be included in the likelihood calculation by giving them their own bins in truth space.
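
Written out, the linear relationship in the first point takes the form below. The symbols $\mu_j$ (expected number of true events in truth bin $j$) and $\nu_i$ (expected number of reconstructed events in reco bin $i$) are labels chosen here for illustration and are not necessarily the paper's notation; $R_{ij}$ is the response matrix.

$$
\nu_i = \sum_j R_{ij}\,\mu_j , \qquad R_{ij} = P(j \rightarrow i)
$$

In this picture a background process simply contributes additional truth bins $j$, with their own columns in $R_{ij}$.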

Methodology

Building the Response Matrix

  • Events are categorized by their true (generator-level) properties into truth bins, and by their reconstructed properties into reco bins.
  • The probability $P(j\rightarrow i)$ for an event in truth bin $j$ to be reconstructed in reco bin $i$ is estimated from Monte Carlo (MC) simulations.
  • These probabilities are collected into a response matrix $R_{ij} = P(j\rightarrow i)$, which models both the selection efficiency and the reconstruction smearing.
  • Systematic uncertainties in the detector response are incorporated by generating a set of toy response matrices, where the detector properties are varied according to their prior distributions.
  • Statistical uncertainties in the matrix elements are handled using a Bayesian approach, modeling the efficiency and smearing separately as beta and Dirichlet distributions.
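
The steps above can be sketched with NumPy. This is a minimal illustration, not the Response Matrix Utilities API: the function names `build_response` and `sample_toy_response`, the convention that a reco index of -1 marks an unselected event, and the flat prior (`prior=1.0`) are all assumptions made for this example. Only the statistical (Beta and Dirichlet) part is shown; systematic toys would additionally require re-running the detector model with varied parameters.

```python
import numpy as np


def build_response(truth_idx, reco_idx, n_truth, n_reco):
    """Estimate R[i, j] = P(truth bin j -> reco bin i) from MC events.

    truth_idx: truth-bin index of every generated event.
    reco_idx:  reco-bin index of every event, or -1 if it failed selection
               (a convention assumed for this sketch).
    """
    truth_idx = np.asarray(truth_idx)
    reco_idx = np.asarray(reco_idx)
    generated = np.bincount(truth_idx, minlength=n_truth).astype(float)
    selected = reco_idx >= 0
    counts = np.zeros((n_reco, n_truth))
    np.add.at(counts, (reco_idx[selected], truth_idx[selected]), 1.0)
    # Truth bins without MC events get NaN: their response is unknown.
    with np.errstate(divide="ignore", invalid="ignore"):
        response = np.where(generated > 0, counts / generated, np.nan)
    return response, counts, generated


def sample_toy_response(counts, generated, rng, prior=1.0):
    """Draw one statistical toy matrix, column by column.

    Efficiency ~ Beta, smearing ~ Dirichlet; the flat prior is an assumption.
    """
    n_reco, n_truth = counts.shape
    toy = np.empty((n_reco, n_truth))
    for j in range(n_truth):
        passed = counts[:, j].sum()
        efficiency = rng.beta(passed + prior, generated[j] - passed + prior)
        smearing = rng.dirichlet(counts[:, j] + prior)
        toy[:, j] = efficiency * smearing
    return toy


# Eight hypothetical MC events in two truth bins and two reco bins.
truth_idx = [0, 0, 0, 0, 1, 1, 1, 1]
reco_idx = [0, 0, 1, -1, 1, 1, 0, -1]
response, counts, generated = build_response(truth_idx, reco_idx, 2, 2)
toy = sample_toy_response(counts, generated, np.random.default_rng(0))
```

Each column of `response` sums to the selection efficiency of that truth bin, so efficiency and smearing are both encoded in the same matrix.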

Using the Response Matrix

  • The response matrix and raw data are published as the primary outputs.
  • To test a model, its truth-level predictions are forward-folded through the response matrix to get reco-level predictions.
  • The likelihood of the data under this folded prediction is then calculated, marginalizing over the set of toy response matrices.
  • This allows testing arbitrary models against the data without needing to redo the detector simulation.
  • Backgrounds can be included by giving them their own truth bins, with their own response parametrization.
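
As a rough sketch of this workflow, the example below folds two hypothetical truth-level predictions through a small set of toy matrices and averages a Poisson likelihood over the toys. The Poisson form of the likelihood, the function names, and all numbers are assumptions for illustration; the third truth bin plays the role of a background bin.

```python
import math

import numpy as np


def fold(response, truth_prediction):
    """Forward-fold truth-level expectations into reco-level expectations."""
    return np.asarray(response, dtype=float) @ np.asarray(truth_prediction, dtype=float)


def poisson_log_likelihood(data, expectation):
    """Sum of Poisson log-probabilities; assumes every expectation is > 0."""
    data = np.asarray(data, dtype=float)
    expectation = np.asarray(expectation, dtype=float)
    log_factorials = np.array([math.lgamma(n + 1.0) for n in data])
    return float(np.sum(data * np.log(expectation) - expectation - log_factorials))


def marginal_log_likelihood(data, truth_prediction, toy_matrices):
    """Log of the likelihood averaged over the toy response matrices."""
    log_ls = np.array(
        [poisson_log_likelihood(data, fold(r, truth_prediction)) for r in toy_matrices]
    )
    shift = log_ls.max()  # subtract the maximum for numerical stability
    return float(shift + np.log(np.mean(np.exp(log_ls - shift))))


# Two hypothetical toys: 2 reco bins x 3 truth bins (2 signal + 1 background).
toys = [
    np.array([[0.50, 0.25, 0.10], [0.25, 0.50, 0.10]]),
    np.array([[0.45, 0.30, 0.12], [0.30, 0.45, 0.08]]),
]
data = np.array([40, 35])
model_a = np.array([60.0, 30.0, 50.0])  # truth-level prediction of model A
model_b = np.array([20.0, 80.0, 50.0])  # truth-level prediction of model B

ll_a = marginal_log_likelihood(data, model_a, toys)
ll_b = marginal_log_likelihood(data, model_b, toys)
```

With these made-up numbers model A folds to reco-level expectations close to the data, so `ll_a` comes out larger than `ll_b`. No detector simulation is involved at any point; only the published matrices and the data are needed.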

Validating Model-Independence

  • The model-independence of the response matrix is tested by comparing matrices built from different event generators.
  • The Mahalanobis distance between the matrices is calculated, and their compatibility is assessed from the probability of observing a distance at least as large.
  • Passing this test is a necessary but not sufficient condition for model-independence.
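
A simplified version of this test can be sketched as follows. It assumes the matrix elements have approximately Gaussian uncertainties described by a known covariance, so that the squared distance follows a chi-square distribution; the covariance, the example matrices, and the function names are hypothetical, and the paper's actual procedure may differ in detail.

```python
import numpy as np


def mahalanobis_squared(matrix_a, matrix_b, covariance):
    """Squared Mahalanobis distance between two flattened response matrices."""
    diff = (np.asarray(matrix_a, dtype=float) - np.asarray(matrix_b, dtype=float)).ravel()
    return float(diff @ np.linalg.solve(covariance, diff))


def tail_probability(distance_sq, dof, rng, n_samples=200_000):
    """Monte Carlo estimate of P(chi2 with dof degrees of freedom >= distance_sq)."""
    return float(np.mean(rng.chisquare(dof, size=n_samples) >= distance_sq))


rng = np.random.default_rng(1)
# Hypothetical matrices built from two different event generators.
resp_gen_a = np.array([[0.50, 0.25], [0.25, 0.50]])
resp_gen_b = np.array([[0.52, 0.24], [0.23, 0.51]])
# Assumed: independent elements, standard deviation 0.02 on each difference.
covariance = np.diag(np.full(resp_gen_a.size, 0.02**2))

d2 = mahalanobis_squared(resp_gen_a, resp_gen_b, covariance)
p_value = tail_probability(d2, dof=resp_gen_a.size, rng=rng)
```

A small `p_value` would indicate that the two generators produce incompatible matrices, that is, that the truth binning leaves a visible model dependence. The sampling-based tail probability keeps the sketch NumPy-only; `scipy.stats.chi2.sf` gives the same quantity without sampling noise.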

Results & Interpretation

The response-matrix-centred approach offers several advantages over traditional unfolding methods:

  • It combines the model-independence of multi-dimensional unfolding with the ability to work with low-statistics, coarsely binned data.
  • It enables non-collaborators to easily test their models against the published data, without needing access to the full detector simulation.
  • For high-statistics measurements, the reco-space comparisons can offer superior model-separation power over unfolded truth-space comparisons.
  • It provides a convenient way to handle backgrounds, without requiring background subtraction.

The main challenge is choosing an appropriate truth-level binning that balances model-independence, detector performance, and statistical uncertainties. Careful validation is required to ensure sufficient model-independence.

Limitations & Uncertainties

  • The response matrix can only be as model-independent as the truth-level binning allows. Hidden model dependencies may still exist if the binning does not capture all relevant detector effects.
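  • The approach relies on having sufficient Monte Carlo statistics to populate the truth bins. The response of truth bins with few or no simulated events is poorly known or unknown, so models that predict events in those bins cannot be tested reliably.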
  • Systematic uncertainties in the detector response are challenging to fully characterize and propagate through the likelihood calculations.

What Comes Next

The authors have developed an open-source Python framework called "Response Matrix Utilities" to facilitate the use of this approach. They envision it being useful for global data comparisons and fits within frameworks like NUISANCE and Rivet.

Further work is needed to:

  • Explore alternative methods for validating model-independence beyond the Mahalanobis distance test.
  • Investigate techniques to handle truth bins with insufficient MC statistics.
  • Optimize the binning strategy to balance model-independence, detector performance, and statistical uncertainties.
  • Integrate the response-matrix-centred approach into existing analysis workflows.

Sources: [1] A response-matrix-centred approach to presenting cross-section measurements (arXiv preprint) [2] Response Matrix Utilities framework documentation