{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Exercise 8: Unfolding" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In almost any experiment, the measured variable is not directly the physics quantity of interest.\n", "Instead, the response of the experimental apparatus has to be translated by the experimentalist, e.g., the signal height might be related to the energy of a particle.\n", "In real experiments, it is impossible to always find the true value of the physics quantity, since the translation suffers from various uncertainties.\n", "E.g., the measured signal might be distorted due to noise and systematic limitations of the experiment.\n", "Moreover, the measured signal is the true signal convolved with the *response function* of the experiment.\n", "In most cases, an analytical deconvolution of the signal is impossible or at least not practical.\n", "Instead, numerical methods are applied to find the distribution of true values and their uncertainties for a given distribution of measured values.\n", "This process is called unfolding.\n", "\n", "In this exercise, we consider only discrete distributions with $N$ bins, corresponding either to histograms of continuous variables (like energy) or naturally discrete variables (like the mass number of atomic nuclei).\n", "Provided that the true and measured distributions have the same number of bins, the experimental response can be described by an $N \\times N$ migration (or transfer) matrix $R$,\n", "where each element $R_{ij}$ describes the probability that a true value in bin $j$ is classified or measured, respectively, in bin $i$.\n", "This is reflected in the relation\n", "\n", "$$\\vec{g} = R \\cdot \\vec{f},$$\n", " \n", "where $\\vec{f}$ is a vector with $N$ elements containing the true values for each bin, $R$ is the migration matrix corresponding to the experimental response,\n", "and $\\vec{g}$ is a vector containing the measured / observed values for each bin. 
This means that an ideal experiment would have the unit matrix as its migration matrix.\n", "\n", "The first step of the unfolding procedure is the determination of the migration matrix, e.g., by calibration measurements or simulations.\n", "We assume that this step has been done already, and the exercises focus on the second step: unfolding of a given measured distribution provided that the migration matrix is known.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### **Hints:**\n", "\n", "NumPy offers several matrix operations, for example:\n", "\n", "[`linalg.inv(R)`](https://numpy.org/doc/stable/reference/generated/numpy.linalg.inv.html) returns the inverse of a (migration) matrix `R`. See also [`np.matrix.I`](https://numpy.org/doc/stable/reference/generated/numpy.matrix.I.html) which performs the same operation.\n", "\n", "[`lam, U = linalg.eig(R)`](https://numpy.org/doc/stable/reference/generated/numpy.linalg.eig.html) returns the eigenvalues `lam` and the matrix `U` of eigenvectors that you need to diagonalize `R` (`lambda` itself is a reserved keyword in Python and cannot be used as a variable name).\n", "\n", "[**Normally you should not capitalize variables in Python!**](https://www.python.org/dev/peps/pep-0008/#function-and-variable-names)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exercise 8.1: Simple Case with 2 Categories (voluntary)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The simplest situation is an experiment with $N=2$, for example the number of fouls per team in a soccer match between team A and B.\n", "Let us assume that the referee is biased and favors team B.\n", "Whenever the referee detects a foul committed by team B, he gives a free kick to team A in only **70%** of the cases, but a free kick to team B in the remaining **30%** of the cases.\n", "Vice versa, when team A commits a foul, team B gets the free kick in **90%** of the cases, but team A only in the remaining **10%** of the cases.\n", "\n", "Thus, the diagonal entries of the migration matrix $R$ are 0.9 and 0.7, and 
the off-diagonal elements are 0.1 and 0.3 (think about which is the correct location of each element).\n", "\n", "In the match, there were **22** free kicks for team A and **24** for team B.\n", "Construct the migration matrix $R$, and estimate by inversion of $R$ how many fouls each team has committed.\n", "\n", "**You can do this using LaTeX in Markdown, or in Python with NumPy arrays**." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import numpy as np" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Free kicks of team A:\t22.0 \t (= foul count for team B)\n", "Free kicks of team B:\t24.0 \t (= foul count for team A)\n", "Fraction of wrong decisions, for team A:\t0.1\n", "Fraction of wrong decisions, for team B:\t0.3\n" ] } ], "source": [ "epsilon1 = 0.1 # 10% wrong detection probability, if team A commits foul\n", "epsilon2 = 0.3 # 30% wrong detection probability, if team B commits foul\n", "foulsDetectedForTeamA = 24. # = number of free kicks for team B\n", "foulsDetectedForTeamB = 22. # = number of free kicks for team A\n", "\n", "# Data from exercise sheet\n", "print(f\"Free kicks of team A:\\t{foulsDetectedForTeamB} \\t (= foul count for team B)\")\n", "print(f\"Free kicks of team B:\\t{foulsDetectedForTeamA} \\t (= foul count for team A)\")\n", "print(f\"Fraction of wrong decisions, for team A:\\t{epsilon1}\")\n", "print(f\"Fraction of wrong decisions, for team B:\\t{epsilon2}\")\n", "\n", "# TODO: Unfold the referee's biased decisions using NumPy!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`# TODO` ... or in Markdown!" 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exercise 8.2: Regularization **(obligatory)**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we consider a counting experiment with 7 categories (bins): each bin contains a number of observed events (e.g., particle decays, cars with different colors, classes of stars in an astronomical survey).\n", "We start from the following true distribution $\\vec{f}$ of 3000 events in the 7 bins:\n", "\n", "| 1 | 2 | 3 | 4 | 5 | 6 | 7 |\n", "| :-- | :-- | :-- | :-- | :-- | :-- | :-- |\n", "| 35 | 218 | 814 | 1069 | 651 | 195 | 18 |\n", "\n", "In our experiment, some of the events are misclassified.\n", "For an event which truly belongs to bin $i$, there is a 30% chance each that it is instead measured in bin $i-1$ or in bin $i+1$ (unless that bin would lie outside the possible range from 1 to 7).\n", "Consequently, the migration matrix $R$ has 0.3 in all elements next to the diagonal, and 0.4 in the diagonal elements, except for the corner elements, which are 0.7.\n" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "from matplotlib import pyplot as plt" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "# Here are some nice colors for your illustrations:\n", "green = '#009682'\n", "blue = '#4664aa'\n", "maygreen = '#8cb63c'\n", "yellow = '#fce500'\n", "orange = '#df9b1b'\n", "brown = '#a7822e'\n", "red = '#a22223'\n", "purple = '#a3107c'\n", "cyan = '#23a1e0'\n", "black = '#000000'\n", "light_grey = '#bdbdbd'\n", "grey = '#797979'\n", "dark_grey = '#4e4e4e'\n", "\n", "clrs = [\n", "    blue,\n", "    maygreen,\n", "    orange,\n", "    green,\n", "    cyan,\n", "    red,\n", "    grey\n", "]" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "# Number of Categories/Bins. 
Transfer matrix will be n_bins x n_bins\n", "n_bins = 7\n", "n_events = 3000\n", "\n", "# True event vector\n", "f = np.array([35., 218., 814., 1069., 651., 195., 18.])\n", "\n", "# TODO: Define the migration matrix" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, let's consider an ideal experiment without any uncertainties:\n", "\n", "**a) Obtain the observed distribution of events $\\vec{g}$.\n", "Then, unfold the observation by multiplication with $R^{-1}$, and compare the result to the true data.\n", "For this, plot the true distribution, the observed distribution, and the unfolded distribution.**" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "# TODO: Fold the true event vector and unfold it again.\n", "\n", "# TODO: Plot the three histograms to illustrate the effects of folding and unfolding on the data!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`# TODO:` Explain what you observe using the markdown syntax" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we consider a more realistic case with uncertainties on the observed events (e.g., Poissonian uncertainties in the case of particle decays).\n", "Thus, the number of observed events deviates from the ideal case, and it is more difficult to reconstruct the original distribution.\n", "In our specific case, the experiment has observed the following distribution $\\vec{g}_\\mathrm{obs}$ for the true distribution $\\vec{f}$ given above:\n", "\n", "\n", "| 1 | 2 | 3 | 4 | 5 | 6 | 7 |\n", "| :-- | :-- | :-- | :-- | :-- | :-- | :-- |\n", "| 99 | 386 | 695 | 877 | 618 | 254 | 71 |\n", "\n", "\n", "**b) Unfold the observation by multiplication with $R^{-1}$, and compare the true, observed, and unfolded distributions by plotting them together.\n", "Which problems do you encounter?**" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "# Measurement including uncertainties\n", 
"g_obs = np.array([99., 386., 695., 877., 618., 254., 71])" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "# TODO: Try this approach to unfolding on the measured data with uncertainties and illustrate the results using matplotlib." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`# TODO:` Explain what you observe using the markdown syntax" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Such unphysical *oscillations* in the unfolded result are typical for many unfolding techniques.\n", "Suppressing them is a major challenge when unfolding real data. One way to achieve this is regularization.\n", "There are various sophisticated methods for regularization (cf. lecture).\n", "Here is a simple method which can be programmed easily in Python:\n", "\n", "- Diagonalize the migration matrix $R$: $R_\\mathrm{diag} = U^{-1} R U$ such that the eigenvalues $\\lambda_i$ of $R$ form the diagonal of $R_\\mathrm{diag}$.\n", "- Construct the observation vector $\\vec{g}_\\mathrm{diag} = U^{-1} \\vec{g}_\\mathrm{obs}$ and multiply each component $i$ of the resulting vector with the corresponding component of $\\vec{\\lambda}^{-1}$,\n", " where $\\vec{\\lambda}^{-1}$ is a vector containing the reciprocals of the eigenvalues of $R$.\n", "- The *regularization*: Set all elements $i$ of $\\vec{g}_\\mathrm{diag}$ to 0, for which $\\lambda_i$ is smaller than a chosen threshold $\\lambda_\\mathrm{reg}$.\n", " The choice of $\\lambda_\\mathrm{reg}$ is critical and a compromise between suppressing the unphysical oscillations, and keeping as much information as possible of the measurement.\n", "- The unfolded result is $U \\cdot \\vec{g}_\\mathrm{diag}$.\n", "\n", "\n", "**c) Apply the regularization method. 
Use different thresholds $\\lambda_\\mathrm{reg}$ from -1 to 1.\n", "As a cross-check: if the algorithm is implemented correctly, the result should be identical to that of Exercise 8.2 b), provided that $\\lambda_\\mathrm{reg}$ is smaller than the smallest eigenvalue of $R$.\n", "Discuss different choices for $\\lambda_\\mathrm{reg}$: as a measure of how similar the true and the unfolded distributions are, calculate the mean quadratic deviation (average over all bins).\n", "For which choice of $\\lambda_\\mathrm{reg}$ is the unfolded distribution most similar to the true distribution?**" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "def do_unfolding_with_regularization(lamb_regu):\n", "    pass\n", "    # TODO: Implement unfolding with regularization in this function with the parameter lamb_regu to provide the threshold." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "# TODO: Try the regularization method for different threshold values and try to optimize the mean quadratic deviation between the true and unfolded data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`# TODO:` Explain what you observe using the markdown syntax" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we will apply an iterative method for unfolding.\n", "We start with a flat assumption for $\\vec{f_0}$, i.e., each element is 3000 / 7 in our case, since we have 7 bins and 3000 events.\n", "Then, we fold $\\vec{f_0}$ by multiplication with $R$ and compare the resulting $\\vec{g_0}$ with the observation $\\vec{g}_\\mathrm{obs}$.\n", "Depending on the discrepancy between $\\vec{g_0}$ and $\\vec{g}_\\mathrm{obs}$, we *tune* our next guess for the true distribution $\\vec{f_1}$ and repeat the procedure to obtain $\\vec{g_1}$.\n", "If the *tuning* is reasonable, the guessed distribution will get closer to the true distribution with each step.\n", "However, in the presence of 
uncertainties, too many iterations can lead to unreasonable results.\n", "Thus, choosing the number of iterations is again a challenge, similar to choosing the best threshold $\\lambda_\\mathrm{reg}$ in the regularization method.\n", "\n", "We will apply the following unfolding algorithm to improve our guess from iteration step $k$ to step $k+1$ (in a simplified version for symmetric response matrices):\n", "\n", "\n", "1) $\\vec{g}_{k+1} = R \\cdot \\vec{f}_k$\n", "\n", "2) Tuning: calculate a vector $\\vec{c}$ with weights to scale each bin $i$: $c^i = g_\\mathrm{obs}^i / g_{k+1}^i$.\n", " Consequently, the weight for bin $i$ is 1 if the observation is already reproduced by the result of step $k$.\n", " \n", "3) Calculate $\\vec{f}_{k+1}$: Multiply each element of $\\vec{f}_k$ with the corresponding element of the vector $R \\cdot \\vec{c}$.\n", "\n", "\n", "**d) Apply the iterative method to the distribution you would observe without any experimental uncertainties.\n", " How many iterations do you need to achieve a maximum deviation of 0.1% per bin between the true distribution and the unfolded observation?**" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "def do_iterative_unfolding(g_in, max_iterations):\n", "    pass\n", "    # TODO: Implement the iterative approach to unfolding in this function.\n", "    # The parameter g_in can be used to provide the ideally folded events as input, or the measured events with uncertainty.\n", "    # max_iterations shall be used to define the maximal number of iterations used by the algorithm." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "# TODO: Try the iterative method for different maximal numbers of iterations to unfold the ideal data set." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`# TODO:` Explain what you observe using the markdown syntax" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**e) Now, apply the method to the observation with uncertainties $\\vec{g}_\\mathrm{obs}$.\n", " Try and discuss different choices for the number of iterations.**" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "# TODO: Apply the iterative method for different maximal numbers of iterations to unfold the measured data set with uncertainties g_obs. Try to minimize the mean quadratic deviation!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`# TODO:` Explain what you observe using the markdown syntax" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**f) The choice of $\\vec{f_0}$ is similar to the choice of a prior in a Bayesian analysis and influences the result.\n", " Try different choices for $\\vec{f_0}$ and test the influence on the result for small and large numbers of iterations.**" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "# Different options for priors of f_start:\n", "prior1 = np.array([1., 20., 300., 400., 300., 20., 1.])\n", "prior2 = np.array([4., 3., 2., 1., 2., 3., 4.])" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "# TODO: Extend the function do_iterative_unfolding so that a prior can be provided via an argument. This prior should then be used as the starting vector f_0." ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [], "source": [ "# TODO: Apply this extended function to the measurements with uncertainties using different priors and max_iterations values to optimize the mean quadratic deviation!" 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`# TODO:` Explain what you observe using the markdown syntax" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.6" } }, "nbformat": 4, "nbformat_minor": 4 }