Almost regardless of your view about the predictability or efficiency of markets, you’ll probably agree that, with rare exception, asset returns are uncertain, or risky. If we set aside the math that underlies probability distributions, we can see that they are pictures that describe a particular view of uncertainty.
Uncertainty refers to randomness; it is different from a lack of predictability or from market inefficiency. An emerging research view holds that financial markets are both uncertain and predictable. Markets can also be efficient and still uncertain. In finance, we use probability distributions to draw pictures that illustrate our view of an asset return’s sensitivity when we think the asset return can be considered a random variable. In this article, we’ll go over a few of the most popular probability distributions and show you how to calculate them.
What Are They?
Distributions can be either discrete or continuous. A discrete distribution refers to a random variable drawn from a finite (or at least countable) set of possible outcomes. A six-sided die, for example, has six discrete outcomes. A continuous distribution refers to a random variable that can take any value within a range, so its set of possible outcomes is infinite. Examples of continuous random variables include speed, distance and some asset returns. A discrete random variable is typically illustrated with dots or dashes, while a continuous variable is illustrated with a solid line. Figure 1 shows discrete and continuous distributions for a normal distribution with mean (expected value) of 50 and standard deviation of 10:
The distribution is an attempt to chart uncertainty. In this case, an outcome of 50 is the most likely but will happen only about 4% of the time; an outcome of 40 is one standard deviation below the mean and will occur just under 2.5% of the time.
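The two percentages above can be checked with a few lines of Python. This is a minimal sketch that assumes the standard library's `statistics.NormalDist` is an acceptable stand-in for the plotted curve; the variable names are illustrative only. For the continuous curve, the height at a point is a density, so the sketch also computes the probability of landing in a one-unit interval around 50, which is the sense in which "about 4% of the time" applies.

```python
from statistics import NormalDist

# Normal distribution from the text: mean 50, standard deviation 10.
dist = NormalDist(mu=50, sigma=10)

# Height of the curve at the mean and at one standard deviation below it.
density_at_mean = dist.pdf(50)
density_at_minus_one_sd = dist.pdf(40)

# Probability of an outcome within half a unit of 50 (area under the curve).
prob_near_mean = dist.cdf(50.5) - dist.cdf(49.5)

print(f"density at 50:        {density_at_mean:.4f}")          # about 0.0399
print(f"density at 40:        {density_at_minus_one_sd:.4f}")  # about 0.0242
print(f"P(49.5 <= X <= 50.5): {prob_near_mean:.4f}")           # about 0.0399
```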
The other distinction is between the probability density function (PDF) and the cumulative distribution function (CDF).
The PDF is the probability that our random variable reaches a specific value (or, in the case of a continuous variable, falls within an interval). We show that by indicating the probability that a random variable ‘X’ will equal an actual value ‘x’: P[X = x].
The cumulative distribution is the probability that random variable ‘X’ will be less than or equal to actual value ‘x’: P[X <= x].
For example, if your height is a random variable with an expected value of 5’10″ (your parents’ average height), then the PDF question is, “What’s the probability that you will reach a height of 5’4″?” The corresponding cumulative distribution function question is, “What’s the probability you’ll be 5’4″ or shorter?”
Figure 1 showed two normal distributions. You can now see these are probability density function (PDF) plots. If we re-plot the exact same distribution as a cumulative distribution, we’ll get the following:
The cumulative distribution must eventually reach 1.0 or 100% on the y-axis. If we raise the bar high enough, then at some point, virtually all outcomes will fall under that bar (we could say the distribution is typically asymptotic to 1.0).
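To see the cumulative distribution climb toward 1.0, we can evaluate it at a few points. The sketch below again assumes `statistics.NormalDist` with the mean of 50 and standard deviation of 10 used earlier; the evaluation points are arbitrary choices for illustration.

```python
from statistics import NormalDist

dist = NormalDist(mu=50, sigma=10)

# P(X <= x) rises from near 0 toward 1 as the bar x is raised.
for x in (20, 40, 50, 60, 80, 100):
    print(f"P(X <= {x:>3}) = {dist.cdf(x):.4f}")

# Probability of landing at or below one standard deviation under the mean.
p_at_or_below_40 = dist.cdf(40)                      # about 0.1587

# Interval probabilities are differences of two cumulative values.
p_between_40_and_60 = dist.cdf(60) - dist.cdf(40)    # about 0.6827
```

Note that the cumulative value never quite reaches 1.0 for a normal distribution; it only gets arbitrarily close.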
Finance, as a social science, is not as clean as the physical sciences. Gravity, for example, has an elegant formula that we can depend on, time and again. Financial asset returns, on the other hand, cannot be replicated so consistently. A staggering amount of money has been lost over the years by clever people who confused accurate distributions (i.e., those that might be derived from the physical sciences) with the messy, unreliable approximations that try to depict financial returns. In finance, probability distributions are little more than crude pictorial representations.
Types of Distributions
The simplest and most popular distribution is the uniform distribution, in which all outcomes have an equal chance of occurring. A six-sided die has a uniform distribution. Each outcome has a probability of about 16.67% (1/6). Our plot below uses a solid line (so you can see it better), but keep in mind that this is a discrete distribution; you can’t roll 2.5 or 2.11:
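A discrete uniform distribution is easy to write down directly. The following sketch uses exact fractions so the 1/6 probabilities are not blurred by rounding; the names `pmf` and `expected_roll` are illustrative only.

```python
from fractions import Fraction

# Each face of a fair six-sided die has the same probability.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

# The probabilities of all outcomes must add up to exactly 1.
total_probability = sum(pmf.values())

# Expected value: each face weighted by its probability.
expected_roll = sum(face * p for face, p in pmf.items())

print(f"P(roll = 3)   = {float(pmf[3]):.4f}")   # about 0.1667
print(f"expected roll = {float(expected_roll)}")  # 3.5
```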
Now roll two dice together, as shown in Figure 4, and the distribution is no longer uniform. It peaks at seven, which happens to have a 16.67% chance. In this case, all the other outcomes are less likely:
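The two-dice distribution can be built by listing all 36 equally likely combinations and counting how many produce each total. This is a small sketch using only the standard library.

```python
from collections import Counter
from itertools import product

# Count how many of the 36 equally likely (die 1, die 2) pairs give each total.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# Convert counts to probabilities.
pmf = {total: n / 36 for total, n in sorted(counts.items())}

for total, p in pmf.items():
    print(f"{total:>2}: {p:.4f}")

most_likely_total = max(pmf, key=pmf.get)   # 7, with probability 6/36
```

Six of the 36 pairs add up to seven, which is where the 16.67% comes from; the totals of 2 and 12 each have only one pair.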
Now roll three dice together, as shown in Figure 5. We start to see the effects of a most amazing theorem: the central limit theorem. The central limit theorem boldly promises that the sum or average of a series of independent, identically distributed variables with finite variance will tend to become normally distributed, regardless of their own distribution. Our dice are individually uniform, but combine them and, as we add more dice, their sum will almost magically tend toward the familiar normal distribution!
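One way to watch the central limit theorem at work is to compute the exact distribution of the sum of n dice and compare it with a normal curve that has the same mean and variance. The sketch below is one possible check, not a proof; the helper names `dice_sum_pmf` and `max_gap_to_normal` are made up for this illustration, and the "gap" is simply the largest difference between the exact probability and the normal density over all possible totals.

```python
from statistics import NormalDist

def dice_sum_pmf(n_dice):
    """Exact distribution of the sum of n fair six-sided dice."""
    pmf = {0: 1.0}
    for _ in range(n_dice):
        next_pmf = {}
        for total, p in pmf.items():
            for face in range(1, 7):
                next_pmf[total + face] = next_pmf.get(total + face, 0.0) + p / 6
        pmf = next_pmf
    return pmf

def max_gap_to_normal(n_dice):
    """Largest difference between the exact pmf and a matching normal density."""
    # One die has mean 3.5 and variance 35/12; both scale with n for independent dice.
    approx = NormalDist(mu=3.5 * n_dice, sigma=(35 * n_dice / 12) ** 0.5)
    exact = dice_sum_pmf(n_dice)
    return max(abs(p - approx.pdf(total)) for total, p in exact.items())

for n in (1, 3, 10):
    print(f"{n:>2} dice: max gap to normal = {max_gap_to_normal(n):.4f}")
```

The gap shrinks as dice are added, which is the tendency the theorem describes.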
The binomial distribution reflects a series of “either/or” trials, such as a series of coin tosses. These are called Bernoulli trials, which refer to events that have only two outcomes, but you don’t need even (50/50) odds. The binomial distribution below plots a series of 10 coin tosses where the probability of heads is 50% (p = 0.5). You can see in Figure 6 that the chance of flipping exactly five heads and five tails (order doesn’t matter) is just shy of 25%:
If the binomial distribution looks normal to you, you are correct. As the number of trials increases, the binomial tends toward the normal distribution.
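The "just shy of 25%" figure follows from the binomial formula: the number of ways to choose which tosses are heads, multiplied by the probability of any one such sequence. Here is a minimal sketch; the function name `binomial_pmf` is illustrative.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n_tosses, p_heads = 10, 0.5

for heads in range(n_tosses + 1):
    print(f"{heads:>2} heads: {binomial_pmf(heads, n_tosses, p_heads):.4f}")

# 252 of the 1,024 equally likely sequences contain exactly five heads.
p_five_heads = binomial_pmf(5, n_tosses, p_heads)   # about 0.2461
```

Changing `p_heads` to something other than 0.5 gives the uneven-odds case mentioned above.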
The lognormal distribution is very important in finance because many of the most popular models assume that stock prices are distributed lognormally. It is easy to confuse asset returns with price levels:
Asset returns are often treated as normal: a stock can go up 10% or down 10%. Price levels are often treated as lognormal: a $10 stock can go up to $30 but it can’t go down to -$10. The lognormal distribution takes only positive values and is skewed to the right (again, a stock can’t fall below zero but it has no theoretical upside limit):
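The link between normal returns and lognormal prices can be illustrated with a short simulation. This is a sketch under stated assumptions: the return is treated as a continuously compounded (log) return, and the mean of 0 and volatility of 0.5 are hypothetical values chosen only to make the skew visible. The $10 starting price comes from the example above.

```python
import math
import random
from statistics import mean, median

rng = random.Random(42)   # fixed seed so the run is repeatable

start_price = 10.0        # the $10 stock from the text
mu, sigma = 0.0, 0.5      # hypothetical mean and volatility of the log return

# If the log return r is normal, the end price start_price * exp(r) is lognormal.
log_returns = [rng.gauss(mu, sigma) for _ in range(100_000)]
prices = [start_price * math.exp(r) for r in log_returns]

print(f"lowest price:  {min(prices):.2f}")
print(f"median price:  {median(prices):.2f}")
print(f"mean price:    {mean(prices):.2f}")
print(f"highest price: {max(prices):.2f}")
```

Every simulated price stays above zero, and the mean sits above the median because the long right tail pulls it up.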
The Poisson distribution is used to describe how many times a certain event (e.g., a daily portfolio loss worse than 5%) is likely to occur over a time interval. So, in the example below, we assume that some operational process has an error rate of 3%. We further assume 100 random trials, which gives an expected three errors; the Poisson distribution describes the likelihood of getting a certain number of errors over some period of time, such as a single day.
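For the example above, the Poisson rate is the expected number of errors, 100 trials times a 3% error rate, or 3. The sketch below evaluates the standard Poisson formula for that rate; the function name `poisson_pmf` is illustrative.

```python
from math import exp, factorial

def poisson_pmf(k, rate):
    """Probability of exactly k events when `rate` events are expected."""
    return rate**k * exp(-rate) / factorial(k)

expected_errors = 100 * 0.03   # 100 trials at a 3% error rate

for k in range(9):
    print(f"{k} errors: {poisson_pmf(k, expected_errors):.4f}")

p_no_errors = poisson_pmf(0, expected_errors)      # about 0.0498
p_three_errors = poisson_pmf(3, expected_errors)   # about 0.2240
```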
The Student’s t distribution is also very popular because it has a slightly “fatter tail” than the normal distribution. The Student’s t is typically used when our sample size is small (i.e., fewer than 30 observations). In finance, the left tail represents the losses. Therefore, if the sample size is small, we risk underestimating the odds of a big loss. The fatter tail on the Student’s t will help us out here. Even so, it happens that this distribution’s fat tail is often not fat enough. Financial returns tend to exhibit, on rare catastrophic occasions, really fat-tail losses (i.e., fatter than predicted by the distributions). Large sums of money have been lost making this point.
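The fatter tail can be made concrete by comparing densities in the left tail. Python's standard library has no Student's t distribution, so the sketch below writes out the textbook density formula by hand; the function name `student_t_pdf` and the choice of 5 degrees of freedom are assumptions for illustration only.

```python
from math import exp, lgamma, pi, sqrt
from statistics import NormalDist

def student_t_pdf(t, dof):
    """Density of the Student's t distribution with `dof` degrees of freedom."""
    # lgamma avoids overflow when the degrees of freedom are large.
    log_coef = lgamma((dof + 1) / 2) - lgamma(dof / 2)
    return exp(log_coef) / sqrt(dof * pi) * (1 + t * t / dof) ** (-(dof + 1) / 2)

normal = NormalDist()   # standard normal: mean 0, standard deviation 1

# Compare how much weight each curve puts on large losses (left tail).
for t in (-2, -3, -4):
    ratio = student_t_pdf(t, 5) / normal.pdf(t)
    print(f"at {t}: t density is {ratio:.1f}x the normal density")
```

The ratio grows the further out in the tail we look, and it shrinks toward 1 as the degrees of freedom increase, since the t distribution approaches the normal for large samples.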
Finally, the beta distribution (not to be confused with the beta parameter in the capital asset pricing model) is popular with models that estimate the recovery rates on bond portfolios. The beta distribution is the utility player of distributions. Like the normal, it needs only two parameters (alpha and beta), but they can be combined for remarkable flexibility. Four possible beta distributions are illustrated in Figure 10 below:
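The flexibility of the beta distribution shows up when the density is evaluated for different parameter pairs. The four pairs below are hypothetical choices for illustration and are not necessarily the ones plotted in the figure; the function name `beta_pdf` is likewise made up for this sketch.

```python
from math import exp, lgamma, log

def beta_pdf(x, alpha, beta):
    """Density of the beta distribution at x, for 0 < x < 1."""
    log_norm = lgamma(alpha + beta) - lgamma(alpha) - lgamma(beta)
    return exp(log_norm + (alpha - 1) * log(x) + (beta - 1) * log(1 - x))

# Hypothetical (alpha, beta) pairs covering four different shapes.
shapes = {
    "U-shaped (0.5, 0.5)": (0.5, 0.5),
    "symmetric hump (2, 2)": (2, 2),
    "skewed right (2, 5)": (2, 5),
    "skewed left (5, 2)": (5, 2),
}

grid = [0.1, 0.3, 0.5, 0.7, 0.9]   # e.g., recovery rates from 10% to 90%
for label, (a, b) in shapes.items():
    row = "  ".join(f"{beta_pdf(x, a, b):5.2f}" for x in grid)
    print(f"{label:<22} mean={a / (a + b):.2f}  {row}")
```

Because the beta distribution lives on the interval from 0 to 1, it is a natural fit for quantities like recovery rates that are expressed as a fraction.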
The Bottom Line
Like so many shoes in our statistical shoe closet, we try to choose the best fit for the occasion, but we don’t really know what the weather holds for us. We may choose a normal distribution then find out it underestimated left-tail losses; so we switch to a skewed distribution, only to find the data looks more “normal” in the next period. The elegant math underneath may seduce you into thinking these distributions reveal a deeper truth, but it is more likely that they are mere human artifacts. For example, all of the distributions we reviewed are quite smooth, but some asset returns jump discontinuously.
The normal distribution is omnipresent and elegant, and it requires only two parameters (mean and standard deviation). Many other distributions converge toward the normal (e.g., binomial and Poisson). However, many situations, such as hedge fund returns, credit portfolios and severe loss events, are poorly described by the normal distribution.