Tetra Quark


Musings on the Gamma Distribution

Statistics

Published May 1, 2024

Abstract
In this post we demonstrate the derivation of the Gamma distribution through two different approaches. First, we show how the sum of \(n\) independent exponentially distributed random variables can be derived iteratively using convolution integrals. Starting with the case of \(n=2\), we explicitly calculate the probability density function and cumulative distribution function, then extend the result to \(n=3\) and generalize to arbitrary \(n\). Second, we present an elegant alternative derivation using Laplace transforms, leveraging their properties to convert convolutions into multiplications in \(s\)-space. Both methods arrive at the same result, showing that the sum of \(n\) independent exponentially distributed random variables is Gamma distributed.
Keywords

exponential, poisson, gamma


\(S_n=\sum_{i=1}^n T_i\) is a sum of \(n\) independent random variables \(T_i\). It is illustrative to consider the \(n=2\) case and work out the distribution of the sum of two random variables \(T_1\) and \(T_2\). The cumulative distribution function of \(S_{2}\equiv T_1+T_2\) is given by: \[\begin{eqnarray} F_{S_{2}}(t)&=&P(T_1+T_2< t)=\int_{t_1+t_2<t}f_{T_1}(t_1)f_{T_2}(t_2) dt_1 dt_2=\int_{-\infty}^\infty\left[\int_{-\infty}^{t-t_1}f_{T_2}(t_2)dt_2\right]f_{T_1}(t_1) dt_1\nonumber\\ &=&\int_{-\infty}^\infty F_{T_2}(t-t_1)f_{T_1}(t_1) dt_1. \label{eq:conv} \end{eqnarray}\]
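
Here is a quick numerical sanity check of Eq. \(\ref{eq:conv}\), using two exponentially distributed variables (the case we specialize to below); the rate and evaluation point are arbitrary choices for this sketch:

```python
import numpy as np

rng = np.random.default_rng(42)
lam = 1.0          # failure rate, chosen arbitrarily for this check
t = 1.5            # evaluation point, also arbitrary

# Monte Carlo estimate of P(T_1 + T_2 < t)
T1 = rng.exponential(1 / lam, 500_000)
T2 = rng.exponential(1 / lam, 500_000)
mc = np.mean(T1 + T2 < t)

# Midpoint-rule quadrature of F_{T_2}(t - t_1) f_{T_1}(t_1) over (0, t)
dt1 = 1e-5
t1 = np.arange(dt1 / 2, t, dt1)
F_T2 = 1 - np.exp(-lam * (t - t1))     # exponential CDF
f_T1 = lam * np.exp(-lam * t1)         # exponential pdf
quad = np.sum(F_T2 * f_T1) * dt1

assert abs(mc - quad) < 0.01
```

Both routes estimate the same probability, as Eq. \(\ref{eq:conv}\) promises.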

The probability density function is the derivative of Eq. \(\ref{eq:conv}\): \[\begin{eqnarray} f_{S_{2}}(t)&=&\frac{d}{dt} F_{S_{2}}(t)=\int_{-\infty}^\infty f_{T_2}(t-t_1)f_{T_1}(t_1) dt_1=\int_{0}^t f_{T_2}(t-t_1)f_{T_1}(t_1) dt_1, \label{eq:convpdf} \end{eqnarray}\] where the limits of the integral are truncated to the range in which \(f\neq0\). The integral in Eq. \(\ref{eq:convpdf}\) is known as the convolution integral: \[\begin{eqnarray} f_{T_1}\circledast f_{T_2}\equiv \int_{-\infty}^\infty f_{T_2}(t-t_1)f_{T_1}(t_1) dt_1. \label{eq:convdef} \end{eqnarray}\] In the special case of exponential distributions, \(f\) is parameterized by a single parameter \(\lambda\), the failure rate: \[\begin{eqnarray} f_{T}(t)=\lambda e^{-\lambda t},\,\, t > 0. \label{eq:rexp} \end{eqnarray}\]
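
Eq. \(\ref{eq:convpdf}\) holds for any pair of densities, not just exponentials. As a sketch, here is the classic uniform example: convolving two \(\mathrm{Uniform}(0,1)\) box densities numerically and comparing against a Monte Carlo density estimate of the sum (the evaluation point and bin width are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)

# The convolution of two Uniform(0,1) box densities is the triangle
# density: f(t) = t on [0,1] and f(t) = 2 - t on [1,2]
t = 0.7
dt1 = 1e-5
t1 = np.arange(dt1 / 2, t, dt1)                           # midpoint grid on (0, t)
f_T1 = ((t1 >= 0) & (t1 <= 1)).astype(float)              # box density f_{T_1}(t_1)
f_T2 = (((t - t1) >= 0) & ((t - t1) <= 1)).astype(float)  # f_{T_2}(t - t_1)
f_S2_conv = np.sum(f_T2 * f_T1) * dt1                     # Eq. (convpdf); equals 0.7 here

# Monte Carlo density estimate of the sum at the same point
S = rng.uniform(0, 1, 500_000) + rng.uniform(0, 1, 500_000)
h = 0.02                                                  # histogram bin width
f_S2_mc = np.mean(np.abs(S - t) < h / 2) / h

assert abs(f_S2_conv - f_S2_mc) < 0.05
```

The same quadrature, with the box densities replaced by Eq. \(\ref{eq:rexp}\), reproduces the exponential computation below.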

From Eq. \(\ref{eq:convpdf}\) we get: \[\begin{eqnarray} f_{S_{2}}(t)&=&\int_{0}^t f_{T_2}(t-t_1)f_{T_1}(t_1) dt_1=\lambda^2 \int_{0}^t e^{-\lambda (t-t_1)}e^{-\lambda t_1} dt_1=\lambda^2e^{-\lambda t} \int_{0}^t dt_1=\lambda^2\, t \,e^{-\lambda t}, \label{eq:convpdfexp} \end{eqnarray}\] which is a Gamma distribution with shape parameter \(2\) and rate \(\lambda\). The corresponding cumulative distribution function is: \[\begin{eqnarray} F_{S_{2}}(t)&=&\int_0^t d\tau f_{S_{2}}(\tau)=\lambda^2\int_{0}^td\tau\, \tau \,e^{-\lambda \tau}=-\lambda^2\frac{d}{d\lambda}\left[\int_{0}^td\tau\,e^{-\lambda \tau}\right]=\lambda^2\frac{d}{d\lambda}\left[\frac{e^{-\lambda t}-1}{\lambda}\right]\nonumber\\ &=&1-e^{-\lambda t}(1+\lambda t). \label{eq:convcumfexp} \end{eqnarray}\] This is pretty neat. Can we move to the next level and add another \(T_i\), i.e., \(S_{3}=T_1+T_2+T_3=S_{2}+T_3\)? We just iterate Eq. \(\ref{eq:convpdf}\) with the probability density of \(S_{2}\) from Eq. \(\ref{eq:convpdfexp}\).
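
A quick numerical check that Eq. \(\ref{eq:convcumfexp}\) really is the integral of Eq. \(\ref{eq:convpdfexp}\) (the rate and endpoint below are arbitrary choices):

```python
import numpy as np

lam = 2.0            # arbitrary rate for this check
t = 1.2              # arbitrary endpoint

# Midpoint-rule integral of the density lam^2 tau e^{-lam tau} over (0, t)
dtau = 1e-6
tau = np.arange(dtau / 2, t, dtau)
f_S2 = lam**2 * tau * np.exp(-lam * tau)        # Eq. (convpdfexp)
F_quad = np.sum(f_S2) * dtau

# Closed form from Eq. (convcumfexp)
F_closed = 1 - np.exp(-lam * t) * (1 + lam * t)

assert abs(F_quad - F_closed) < 1e-9
```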

\[\begin{eqnarray} f_{S_{3}}(t)&=&\int_{0}^t f_{T_3}(t-t_1)f_{S_{2}}(t_1) dt_1 =\lambda^3 \int_{0}^t e^{-\lambda (t-t_1)} t_1 \,e^{-\lambda t_1} dt_1=\lambda^3\frac{t^2}{2}e^{-\lambda t}, \label{eq:convpdf3} \end{eqnarray}\] which was very easy! In fact, we can keep adding more terms. The exponentials kindly drop out of the \(t_1\) integral, leaving us to integrate simple powers of \(t_1\), and for \(S_{n}\equiv T_1+T_2+\cdots +T_n\) we get: \[\begin{eqnarray} f_{S_{n}}(t)&=&\lambda^n\frac{t^{n-1}}{(n-1)!}e^{-\lambda t}. \label{eq:convpdfn} \end{eqnarray}\]
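
Eq. \(\ref{eq:convpdfn}\) is easy to check by brute force: simulate \(S_n\) as a sum of \(n\) exponential draws and compare against the formula. The rate, \(n\), and threshold below are arbitrary choices for this sketch:

```python
import numpy as np
from math import factorial

rng = np.random.default_rng(7)
lam, n = 1.5, 5        # arbitrary rate and number of summands

# Simulate S_n as a sum of n independent exponentials
S = rng.exponential(1 / lam, (500_000, n)).sum(axis=1)

# P(S_n < t) from Eq. (convpdfn), integrated by the midpoint rule
t = 3.0
dtau = 1e-4
tau = np.arange(dtau / 2, t, dtau)
f = lam**n * tau**(n - 1) * np.exp(-lam * tau) / factorial(n - 1)
p_formula = np.sum(f) * dtau

assert abs(np.mean(S < t) - p_formula) < 0.01
```

The empirical and analytic probabilities agree to within Monte Carlo noise.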

It will be fun to redo this with a more advanced mathematical tool, the Laplace transform, which is defined as: \[\begin{eqnarray} \tilde f(s)\equiv\mathcal{L}\big[f(t)\big]=\int_0^\infty dt \, e^{-s \,t} f(t). \label{eq:laplacedef} \end{eqnarray}\] There are a couple of nice features of the Laplace transform we can make use of. The first one is the mapping of convolution integrals in \(t\) space to multiplications in \(s\) space. To show this, let’s take the Laplace transform of Eq. \(\ref{eq:convdef}\):

\[\begin{eqnarray} \mathcal{L}\big[f_{T_1}\circledast f_{T_2}\big] = \int_0^\infty dt \, e^{-s \,t} \int_{-\infty}^\infty f_{T_2}(t-t_1)f_{T_1}(t_1) dt_1 = \int_{-\infty}^\infty dt_1 \int_0^\infty dt \, e^{-s \,(t-t_1) } f_{T_2}(t-t_1)e^{-s \,t_1 } f_{T_1}(t_1). \label{eq:convdefL} \end{eqnarray}\] Let’s take a closer look at the middle integral: \[\begin{eqnarray} \int_0^\infty dt \, e^{-s \,(t-t_1) } f_{T_2}(t-t_1) = \int_{-t_1}^\infty d\tau \, e^{-s \tau} f_{T_2}(\tau) = \int_{0}^\infty d\tau \, e^{-s \tau} f_{T_2}(\tau)= \tilde f_{T_2}(s), \label{eq:convdefLm} \end{eqnarray}\] where we first changed variables to \(\tau=t-t_1\), and then raised the lower limit of the integral to \(0\) since \(f_{T_2}(\tau)=0\) for \(\tau<0\). Putting this back in, we have the nice property: \[\begin{eqnarray} \mathcal{L}\big[f_{T_1}\circledast f_{T_2}\big] = \tilde f_{T_1}(s) \tilde f_{T_2}(s). \label{eq:convdefLf} \end{eqnarray}\] How do we make use of this? The probability density of a sum of independent random variables is the convolution of the individual densities: \[\begin{eqnarray} f_{S_{n}}=\underbrace{f_{T_1}\circledast f_{T_2}\circledast \cdots \circledast f_{T_n}}_{n \text{ terms}}. \label{eq:nconvol} \end{eqnarray}\] We can map this convolution to multiplications in \(s\) space: \[\begin{eqnarray} \tilde f_{S_{n}}(s)\equiv\mathcal{L}\big[f_{S_{n}}\big] =\underbrace{\tilde f_{T_1} \tilde f_{T_2} \cdots \tilde f_{T_n}}_{n \text{ terms}}=\displaystyle \prod_{j=1}^{n} \tilde f_{T_j}. \label{eq:condP3prod} \end{eqnarray}\] When the individual random variables are independent and identically distributed, we get: \[\begin{eqnarray} \tilde f_{S_{n}}(s) =\left(\tilde f_{T}\right)^n. \label{eq:condP3prodf} \end{eqnarray}\] If the random variables are exponentially distributed, as in Eq. 
\(\ref{eq:rexp}\), their Laplace transform is easy to compute: \[\begin{eqnarray} \tilde f(s)=\int_0^\infty dt \, e^{-s \,t}\lambda e^{-\lambda t} =\frac{\lambda}{s+\lambda}, \label{eq:laplaceofexp} \end{eqnarray}\] which means the Laplace transform of the sum is: \[\begin{eqnarray} \tilde f_{S_{n}}(s) =\left(\frac{\lambda}{s+\lambda} \right)^n. \label{eq:condP3prodfexp} \end{eqnarray}\] We will have to invert Eq. \(\ref{eq:condP3prodfexp}\), which requires a small trick. This brings us to the second nifty property of the Laplace transform. Consider transforming \(t f(t)\):

\[\begin{eqnarray} \mathcal{L}\big[t f(t)\big]=\int_0^\infty dt \, t e^{-s \,t} f(t)=-\frac{d}{ds}\left[\int_0^\infty dt \, e^{-s \,t} f(t)\right] =-\frac{d}{ds}\left[\tilde f(s)\right]. \label{eq:flaplacedef} \end{eqnarray}\] Therefore, the Laplace transform maps multiplication by \(t\) to a negative derivative in \(s\) space: \[\begin{eqnarray} t \iff -\frac{d}{ds}. \label{eq:flaplacedemap} \end{eqnarray}\] We re-write Eq. \(\ref{eq:condP3prodfexp}\) as: \[\begin{eqnarray} \tilde f_{S_{n}}(s) =\left(\frac{\lambda}{s+\lambda} \right)^n= \frac{\lambda^{n-1}}{(n-1)!}\left(-\frac{d}{ds} \right)^{n-1} \left(\frac{\lambda}{s+\lambda} \right). \label{eq:condP3prodfexp2} \end{eqnarray}\] Since \(\lambda/(s+\lambda)\) is the transform of \(\lambda e^{-\lambda t}\), and each \(-d/ds\) becomes a factor of \(t\) by Eq. \(\ref{eq:flaplacedemap}\), we can invert the transform term by term: \[\begin{eqnarray} f_{S_{n}}(t)=\mathcal{L}^{-1}\big[\tilde f_{S_{n}}\big]=\frac{\lambda^{n-1}}{(n-1)!}\,t^{n-1}\,\lambda e^{-\lambda t}=\lambda^n\frac{t^{n-1}}{(n-1)!}e^{-\lambda t}, \label{eq:convpdfn2} \end{eqnarray}\] which is what we got earlier in Eq. \(\ref{eq:convpdfn}\).
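
As a final sanity check of the whole \(s\)-space pipeline, we can evaluate the Laplace integrals numerically and confirm both Eq. \(\ref{eq:laplaceofexp}\) and Eq. \(\ref{eq:condP3prodfexp}\) for the Gamma density of Eq. \(\ref{eq:convpdfn}\); the rate, transform variable, and order below are arbitrary choices:

```python
import numpy as np
from math import factorial

lam, s, n = 2.0, 1.0, 4     # arbitrary rate, transform variable, and order
dt = 1e-4
t = np.arange(dt / 2, 80, dt)   # midpoint grid; the tail beyond 80 is negligible

# Transform of a single exponential: should equal lam/(s+lam), Eq. (laplaceofexp)
lt_exp = np.sum(np.exp(-s * t) * lam * np.exp(-lam * t)) * dt

# Transform of f_{S_n}: should equal (lam/(s+lam))^n, Eq. (condP3prodfexp)
f_Sn = lam**n * t**(n - 1) * np.exp(-lam * t) / factorial(n - 1)
lt_Sn = np.sum(np.exp(-s * t) * f_Sn) * dt

assert abs(lt_exp - lam / (s + lam)) < 1e-6
assert abs(lt_Sn - (lam / (s + lam))**n) < 1e-6
```

The quadrature reproduces both transforms, closing the loop between the convolution and Laplace derivations.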