Voodoo Math

Essays | Max James Rounds | Friday, January 1, 2010 | 1 Comment

On April 2, 2007, New Century Financial Corporation, a major subprime mortgage lender, filed for Chapter 11 bankruptcy, introducing the word ‘subprime’ into the popular English lexicon. New Century’s failure was one of the early signs that something was horribly wrong with the US housing market. Over the next two years, the problems with American real estate would eventually claim (in one form or another) Countrywide Financial Corporation, Bear Stearns, Fannie Mae, Freddie Mac, Lehman Brothers, Merrill Lynch, AIG, Washington Mutual, Wachovia, and countless smaller institutions.

Much of the reason for the demise of these corporations centered on the devaluation of financial products known as mortgage-backed securities (MBS) and collateralized debt obligations (CDOs), two investments that gained popularity beginning in 2000. These two products were essentially pools of (often subprime) home loans that investors could buy in return for a piece of the monthly payments. The borrowers of subprime mortgages were, in most cases, at a high risk of default. Because banks were nervous about recouping their investment, borrowers were obligated to pay higher interest rates on their mortgages, making MBS and CDOs lucrative investments. However, this also meant that subprime mortgages were extremely risky. Indeed, when the value of these investments plummeted in 2007 and 2008, the institutions that owned them became insolvent or were taken over immediately prior to insolvency.

A year after the financial crisis began, I was having dinner with my father when he asked me a question about subprime mortgages, MBS and CDOs. He was trying to understand how it had ever seemed like a good idea to package extremely risky mortgages and sell them to investors. In response, I focused on what I perceived to be the cornerstone of the appeal of MBS and CDOs: correlation.

From the standpoint of subprime mortgages, correlation represents the tendency of borrowers to default at the same time. As long as the correlation of default among these homeowners is relatively small, the high mortgage payments compensate for the high risk of default. The danger with such an approach, however, is that the correlation among defaults can increase, ensuring that lenders do not recoup a substantial portion of their investment.

I told my father that these products came to life in a time when defaults were relatively uncorrelated, and that given the assumption that this behavior would continue into the future, MBS and CDOs were great investments that offered investors a higher rate of return and allowed those otherwise unable to own a home a chance to grab a piece of the American Dream for themselves. As borrowers began to default in ever-greater numbers, however, the assumption of uncorrelated defaults failed, and the products showed their inherent riskiness as they precipitated the global financial crisis.

But how did Wall Street come to make this mistake in the first place? Before we can answer this question, we must first better understand the concepts and problems of modern investing.

THE ASSET ALLOCATION PROBLEM

Consider the problem of any investor in the stock market: for a portfolio of risky assets (typically stocks), how can I maximize the return on my money for a given level of risk that I am willing to take? This problem was first addressed by Harry Markowitz in 1952 and later expanded by William Sharpe in 1964; together their writings form the basis of a field known as modern portfolio theory. Markowitz and Sharpe’s solution depended on three parameters: the average return on a stock, the volatility of that return, and the correlation of that return to another stock.

In statistics, a parameter is a “description” of a numeric population. For example, mean, median, and standard deviation are all statistical parameters. However, we almost never know the true value of a parameter; we can only estimate it using statistics computed from a sample. The first parameter in question, average return, is the ratio of money gained or lost relative to the amount of money invested. If I put $100 in the bank and I earn $5 in interest over one year, my rate of return is 5%. While the average return on a stock is a reasonably intuitive concept, the volatility and correlation are much less so and warrant further explanation.

Intuitively, volatility is the distance of the actual returns (those that appear in the real world) from the returns that we predicted. Formally, it is the standard deviation of the return of an asset. It is commonly used as a measure of risk; the greater the volatility, the less sure an investor can be about his asset’s performance. For those seeking to minimize their risk, one approach is to minimize the volatility of the expected return of their investments.

The third parameter, correlation, measures the strength and direction of a linear relationship between two variables. For example, if the share prices of Google and Apple tend to rise and fall together, these two variables exhibit a positive correlation. Conversely, if the share price of Dell tends to fall when the share price of Apple rises, these two stocks exhibit a negative correlation. Correlations always lie between -1 and 1, a perfectly negative and perfectly positive correlation, respectively. It is important to keep in mind, however, that correlation does not imply anything about the degree of the movements. For instance, if a Toyota Camry loses 10% each year in value and a Volkswagen Beetle loses 15%, their correlation will be 1 in spite of the fact that the cars lose value at different rates.
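
The point that correlation captures direction but not magnitude can be checked numerically. The sketch below is only an illustration: the function name `pearson` and the three return series are invented for the example and do not come from any real stock.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical daily returns, invented for illustration only.
stock_a = [0.010, -0.020, 0.015, 0.003, -0.007]
stock_b = [1.5 * r for r in stock_a]  # moves the same way, 1.5 times as far
stock_c = [-r for r in stock_a]       # moves the opposite way

print(pearson(stock_a, stock_b))  # close to 1 despite the larger swings
print(pearson(stock_a, stock_c))  # close to -1
```

The second series swings half again as far as the first, yet the correlation stays at essentially 1, which is the same behavior as the two cars depreciating at different rates.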

The power of the solution devised by Markowitz and Sharpe can be illustrated through a simple diversification example. Suppose you are offered a choice of investment portfolios: only gold, only the stock market (proxied by the S&P 500), or some mix of the two. Using the year 2006 as an example, a portfolio of just the S&P 500 had an average annualized daily return of 11.7%, while a portfolio of just gold returned an average of 20.4%. While gold outperformed the stock market, it had its own drawback: risk. The average annualized daily volatility of the S&P 500 portfolio returns was 9.9%. Gold, in contrast, exhibited a volatility of 23.9%. So while gold performed about twice as well as the stock market, it was over twice as risky.

Diversification shows its teeth once we look at the performance of portfolios of both assets combined. During the year 2006, the returns of the S&P 500 and gold had a correlation of only 0.16, implying that they had a weak relationship. With this in mind, we could have achieved portfolios with the characteristics of the figure below:

[Figure: return versus volatility of portfolios mixing the S&P 500 and gold, 2006]

The dashed region describes the portfolios that have both a higher return and a lower volatility than the S&P 500. These particular mixes of the S&P 500 and gold offer the investor the best of both worlds, earning more money with less risk. In particular, Portfolio A, a mix of 90% market and 10% gold, had a return of 12.6% and a volatility of only 9.6% (versus the S&P 500, which had a return of 11.7% and volatility of 9.9%). This was possible because of the correlation between gold and the S&P 500: when two assets have a weak or negative correlation, their randomness tends to “cancel out,” allowing for decreased risk.
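
The essay does not spell out how Portfolio A's figures were obtained, but they are consistent with the standard two-asset mean-variance formulas. The sketch below assumes those formulas; the function name `two_asset_portfolio` is made up for the example, and the inputs are the 2006 figures quoted above.

```python
import math

def two_asset_portfolio(w_a, ret_a, vol_a, ret_b, vol_b, corr):
    """Return (expected return, volatility) of a two-asset mix.

    w_a is the weight in asset A; the remainder goes to asset B.
    """
    w_b = 1.0 - w_a
    ret = w_a * ret_a + w_b * ret_b
    variance = ((w_a * vol_a) ** 2 + (w_b * vol_b) ** 2
                + 2 * w_a * w_b * corr * vol_a * vol_b)
    return ret, math.sqrt(variance)

# 2006 figures quoted in the essay, in percent.
SP500_RET, SP500_VOL = 11.7, 9.9
GOLD_RET, GOLD_VOL = 20.4, 23.9
CORR = 0.16

# Portfolio A: 90% S&P 500, 10% gold.
ret, vol = two_asset_portfolio(0.90, SP500_RET, SP500_VOL,
                               GOLD_RET, GOLD_VOL, CORR)
print(f"Portfolio A: return {ret:.1f}%, volatility {vol:.1f}%")
```

Under these assumptions the mix comes out at roughly 12.6% return and 9.6% volatility. The last term of the variance is where correlation enters: the lower `corr` is, the smaller that term, and the more the two assets' swings offset each other.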

So why haven’t financial advisers lost their jobs and left asset allocation to computers? Unfortunately, the previous calculations were done with the benefit of hindsight. It is much more difficult (some might say impossible) to find statistics that accurately describe the future behavior of a stock given past observations. In order to do this, we need more sophisticated statistical tools.

Suppose you want to estimate the return on the S&P 500 in 2010. One option is to estimate the return using data from the past year. For example, from May 4, 2008 to May 4, 2009, the S&P 500 had an average annualized daily return of -34%. But how sure can we be that this statistic will accurately predict the return in 2010? In order to determine the reliability of our estimate, we could construct a confidence interval. For instance, a 95% confidence interval implies that we can say “with 95% confidence” that the true average return lies within the interval. Consequently, the smaller the interval, the more certain we can be about our estimate. In this spirit, the 95% confidence interval computed from that year of data is -122% to 55%. In other words, we can say with 95% confidence that the true average annual return on the S&P 500 lies between -122% and 55%. This is a fairly useless result.

To improve our estimation, we could use more data in order to reduce the width of the confidence interval. Extending the window of data back five years provides a 95% confidence interval of -22% to 19%. This estimate is more reliable, but would you bet your child’s college education or your retirement on these numbers?

We could include even more data to increase the reliability of our estimate, but here we arrive at the great paradox of investing. As the amount of data increases, the width of the confidence interval decreases, thus increasing the reliability of the statistic. However, adding more data points usually means adding older, less relevant ones. Does the average return in 1999 really help us predict the average return in 2009? This is one of the key problems of estimating parameters from financial time series: there is a reliability/relevance tradeoff. As we include more data, we become more certain about a statistic that is less useful.
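
How quickly the interval narrows can be sketched with the usual normal approximation. This is a minimal illustration under strong assumptions: returns are treated as independent and identically distributed, the function name `mean_return_ci` is made up, and the 45% annual volatility is a hypothetical input chosen for the example rather than a figure from the essay.

```python
import math

def mean_return_ci(mean, annual_vol, years, z=1.96):
    """Approximate 95% confidence interval for the true average annual return.

    Assumes independent, identically distributed returns, so the standard
    error of the annualized mean is annual_vol / sqrt(years).
    """
    half_width = z * annual_vol / math.sqrt(years)
    return mean - half_width, mean + half_width

# The -34% mean is quoted in the essay; the 45% volatility is an assumption.
ASSUMED_VOL = 0.45
low, high = mean_return_ci(-0.34, ASSUMED_VOL, years=1)
print(f"1 year of data: {low:.0%} to {high:.0%}")

# The half-width shrinks only with the square root of the window length.
for years in (1, 5, 25, 100):
    half_width = 1.96 * ASSUMED_VOL / math.sqrt(years)
    print(f"{years:>3} years of data: +/- {half_width:.0%}")
```

Because the width falls with the square root of the amount of data, halving the interval requires four times as much history. The formula also says nothing about whether that older history is still relevant, which is exactly the tradeoff described above.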

Moreover, we have relied this entire time on the underlying assumption that the world is static. In order to use these estimation procedures, we must first assume the type of randomness remains constant. If we were flipping a weighted coin, such an assumption would be reasonable; an unfair quarter is the same in every toss. Estimating the parameters of a stock, however, is akin to estimating the odds of the coin when those odds change with each flip. The underlying style of the randomness of the stock market is constantly shifting. Each day, world news changes the return, volatility, and correlation of each stock, often making the estimation of its parameters an exercise in futility.

THE PUNCH LINE: PARAMETER ESTIMATION AS IT PERTAINS TO DERIVATIVES

The 2008 financial crisis has given financial instruments known as derivatives a bad name. What are they exactly? Derivatives are investments whose value depends on the price of other financial products. Derivative pricing models often rely on parameters to determine their worth just as the solution to the asset allocation problem requires the returns, volatilities, and correlations of the assets within. The derivatives that have consistently made the news during the current financial crisis include mortgage-backed securities (MBS) and collateralized debt obligations (CDOs); the value of these two investments is determined by a pool of mortgages that pay out interest, much like a bond. These products function in a similar fashion to a portfolio of stocks, but instead of the correlation between asset returns, loan portfolios must take into account the correlation of default risk.

These pools of mortgages attempt to diversify away the risk of default. For instance, if one loan defaults in a pool of 100, there are still 99 loans left. With this in mind, MBS and collateralized debt obligations can be excellent investments, assuming only a few of the underlying mortgages default. The tendency of loans to default together is referred to as their default correlation, and it functions just like the correlation between gold and the S&P 500. If the default correlations in a pool of mortgages rise, the investor may see many more than just one or two mortgages go under, which in turn would vastly decrease the value of the MBS and CDOs holding those loans. As a result, without accurate estimates of default correlations, investors will stand to lose vast sums of money.
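
The effect of default correlation on a pool can be illustrated with a deliberately simplified model. The sketch below assumes every loan has the same default probability and every pair of loans has the same default correlation; the function name `default_count_stats` and the 5% default probability are hypothetical choices for the example, not figures from the essay.

```python
import math

def default_count_stats(n_loans, p_default, default_corr):
    """Mean and standard deviation of the number of defaults in a pool.

    Assumes every loan has the same default probability and every pair
    of loans has the same default correlation.
    """
    mean = n_loans * p_default
    variance = (n_loans * p_default * (1 - p_default)
                * (1 + (n_loans - 1) * default_corr))
    return mean, math.sqrt(variance)

# Hypothetical pool: 100 loans, each with a 5% chance of default.
for corr in (0.0, 0.1, 0.3):
    mean, std = default_count_stats(100, 0.05, corr)
    print(f"default correlation {corr:.1f}: "
          f"expect {mean:.0f} defaults, give or take {std:.1f}")
```

In this model the expected number of defaults does not change with the correlation; only the spread around it does. A pool that looks safe on average can therefore still suffer many simultaneous defaults once the correlation rises.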

The consequences of rising default correlations can be dire for a pool of loans. Unlike a stock, a loan entails that investors either get their money back (no default) or do not get any money (default). Rising default correlations thus imply that it is becoming more and more likely that a lender loses his investment. For example, assume you loan 10 friends $1,000 each. Say that when you make the loan, times are good, your friends have steady jobs, and you are not worried about their ability to repay you. One might say your friends have a low correlation of default, as what makes one friend default (unexpected injury, for instance) will not cause the others to do so as well. Now, say the economy worsens. Perhaps several of your friends lose their jobs. As a lender, you are now in danger of losing your money, since a few of your friends may not be able to pay you back. In the end, you might only get $6,000 out of the original $10,000. The tendency of your friends to default together is characterized by a rise in their default correlations. Imagine, however, that you were still using the assumption of low default correlations to assess your own financial situation. You would be sorely misinformed. Furthermore, once you realized your error, it would already be too late. No one would want to buy the risky loans from you under so much uncertainty. This is exactly the situation in which holders of MBS and CDOs found themselves as the real estate market began to worsen in late 2006 and early 2007.

It is therefore extremely important for the accurate pricing of a pool of mortgages to obtain accurate default correlations. Like volatility, correlation is not directly observable. There are no daily correlation figures we can derive and then average. Instead, an investor trying to get a sense of the correlation between two sources of randomness must look at a time series and make an inference. This works moderately well with two stocks since there are daily prices that are readily available. With the likelihood of default, however, this can be extremely daunting. A mortgage has never defaulted until it defaults, and thus for an individual mortgage, there is no time series of defaults that can produce a correlation metric.

CONCLUSION

In many ways, the popularity of MBS and CDOs, beginning in 2000, precipitated the demand for subprime mortgages, only to be crushed by rising defaults eight years later. They were sold on the assumption of low default correlations and higher returns than bonds of supposedly comparable risk. At the time MBS and CDOs were created, assumptions of low default correlations very well may have been accurate; however, as the adjustable-rate subprime mortgages underlying them reset to higher rates, defaults became much more correlated, and the value of the credit derivatives declined to almost nothing.

The disaster that ensued has prompted a massive re-evaluation of credit models, default models, and correlation estimation algorithms. In more recent research, the phrase non-parametric estimate comes up again and again as an approach to deriving useful information from data without some of the massive pitfalls that arise from direct parameter estimation techniques. Despite this, the use of past data to predict future events will always have its limits. In the words of Nassim Nicholas Taleb, a famous quantitative trader: “Co-association between securities is not measurable using correlation. Anything that relies on correlation is charlatanism.”

Comments
One Response to “Voodoo Math”
  1. Doug Kahn says:

    The initial marketability of the CDOs themselves also changed the mortgage loan industry. A substantial majority of high-risk mortgages in the last 5 years were developed by loan brokers, because they discovered they could resell almost any loan to the entities creating the CDOs. So previous assumptions about how default-prone any individual mortgage was became obsolete very quickly.
