In this blog post, I present some backtest results on volatility models. The list here is not exhaustive, and there is still a gargantuan set of papers on this issue (a good place to start is vlab). In the next section, I introduce some simple notation to define financial volatility, then define each model and show general backtest results with risk attributes.

The premise of the backtest is as follows: the volatility of an investment portfolio can be minimized globally by allocating the right dollar amount to each asset within the portfolio. The backtest constructs a portfolio each week from a fixed list of assets, in our case the nine S&P 500 sector ETFs. The volatility of each asset must be forecasted for the next week, so the exercise is to find which volatility model has historically been best out of sample by comparing the performance of each constructed portfolio. The backtest runs weekly from January 1, 2005 until June 23, 2015. Trading slippage and commissions are ignored in every backtest since they introduce more noise into the results.
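To make the portfolio construction step concrete, here is a minimal sketch of global minimum-variance weights, assuming a fully invested portfolio with short positions allowed and the closed form $w = \Sigma^{-1}\mathbf{1} / (\mathbf{1}'\Sigma^{-1}\mathbf{1})$. The function names and the toy covariance matrix are illustrative assumptions, not the exact optimizer used in the backtest.

```python
import numpy as np

def min_variance_weights(cov):
    """Fully invested minimum-variance weights (shorting allowed, an assumption)."""
    cov = np.asarray(cov, dtype=float)
    ones = np.ones(cov.shape[0])
    raw = np.linalg.solve(cov, ones)   # proportional to inv(cov) @ ones
    return raw / raw.sum()             # rescale so the weights sum to one

def portfolio_volatility(weights, cov):
    """Standard deviation of the portfolio return implied by cov."""
    return float(np.sqrt(weights @ cov @ weights))

# Toy two-asset covariance matrix (hypothetical numbers).
cov = np.array([[0.04, 0.01],
                [0.01, 0.09]])
w = min_variance_weights(cov)
print(w, portfolio_volatility(w, cov))
```

Each volatility model below only changes the `cov` that is fed into this step; the weighting rule itself stays the same.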
The volatility estimators are all applied in a multivariate setting, many of them with simplifying assumptions. Before introducing them, we need some basic notation. Let $r_t$ be a simple return, or $r_t = P_t / P_{t-1} - 1$, where $P_t$ is the daily close price at time $t$. Volatility, something inherently unobservable, is best proxied by the statistical measure variance, $\sigma^2 = E[(r_t - \mu)^2]$, which, given a finite sample, can be estimated as $\hat{\sigma}^2 = \frac{1}{T-1}\sum_{t=1}^{T}(r_t - \bar{r})^2$ where $\bar{r} = \frac{1}{T}\sum_{t=1}^{T} r_t$ is the sample mean. In a multivariate setting, an asset has its own variance and also exhibits a degree of covariance with other assets. These two pieces of information can be condensed into a covariance matrix $\Sigma = \mathrm{Cov}(X)$, where $X$ is a T by N matrix such that each row is a daily return observation. It is worth noting that any variable that does not carry the $t$ subscript is assumed to be time invariant.
(1) Sample Covariance
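In this setting, the covariance matrix is estimated via a traditional sample estimate,
$$\hat{\Sigma} = \frac{1}{T-1}\left(X - \mathbf{1}\bar{r}'\right)'\left(X - \mathbf{1}\bar{r}'\right)$$
where $X$ is the T by N return matrix, $\bar{r} = \frac{1}{T}X'\mathbf{1}$ is the N by 1 vector of sample means and $\mathbf{1}$ is a T by 1 vector of ones. With a divisor of $T$ instead of $T-1$, this is the maximum likelihood estimate under the normal distribution; the $T-1$ version is unbiased, and both are consistent.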
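As a quick check of the sample estimate, the sketch below computes it by hand and compares it with NumPy's built-in. The function name `sample_covariance` and the simulated returns are assumptions for illustration only.

```python
import numpy as np

def sample_covariance(returns, unbiased=True):
    """Sample covariance of a T by N return matrix (rows are observations)."""
    x = np.asarray(returns, dtype=float)
    t = x.shape[0]
    centered = x - x.mean(axis=0)          # subtract each asset's sample mean
    divisor = t - 1 if unbiased else t     # T gives the normal-likelihood MLE
    return centered.T @ centered / divisor

# Simulated returns stand in for the sector ETF data (hypothetical).
rng = np.random.default_rng(0)
returns = rng.normal(scale=0.01, size=(250, 3))
print(np.allclose(sample_covariance(returns), np.cov(returns, rowvar=False)))
```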
(2) Constant Correlation Model
The covariance matrix can be decomposed as $\Sigma = D R D$, where $D$ is the diagonal matrix of asset standard deviations and $R$ is a correlation matrix in which each off-diagonal entry is the correlation (between -1 and 1) between two assets. The argument for a constant correlation model stems from the intuition that sample correlations are too noisy over the short term, so better estimates can be obtained by replacing every pairwise correlation with the mean of all pairwise sample correlations. Once the correlation structure is known, only the standard deviations are left to forecast. In this model, I try two approaches:
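- Sample standard deviation, or simply $\hat{\sigma}_i = \sqrt{\hat{\sigma}_i^2}$ where $\hat{\sigma}_i^2 = \frac{1}{T-1}\sum_{t=1}^{T}(r_{i,t} - \bar{r}_i)^2$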
- Forecasting from a GARCH(1,1) process, where variance is time-varying and the model captures the autocorrelation structure of variances. The GARCH(1,1) (with variance targeting) for an individual asset is specified as:
$$\sigma_t^2 = \bar{\sigma}^2(1 - \alpha - \beta) + \alpha\,\epsilon_{t-1}^2 + \beta\,\sigma_{t-1}^2$$
where $\bar{\sigma}^2$ is the unconditional variance, often set to the sample estimate, and $\epsilon_{t-1}$ is the last period's demeaned return. For each period in the backtest, the GARCH(1,1) is fitted and the standard deviation is forecasted one period ahead. The parameters $\alpha$ and $\beta$ are found by numerical quasi-maximum likelihood estimation (QMLE) under the assumption of normal returns, or $r_t \sim N(\mu, \sigma_t^2)$.
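The two approaches can be sketched as follows. This is a minimal illustration, not the fitted model from the backtest: `constant_correlation_cov` and `garch11_variance_forecast` are hypothetical names, the GARCH parameters `alpha` and `beta` are taken as given rather than estimated by QMLE, and the recursion is started at the sample variance (an assumption).

```python
import numpy as np

def constant_correlation_cov(returns, vols=None):
    """Covariance with every pairwise correlation set to the average one."""
    x = np.asarray(returns, dtype=float)
    corr = np.corrcoef(x, rowvar=False)
    n = corr.shape[0]
    mean_corr = (corr.sum() - n) / (n * (n - 1))   # mean of off-diagonal entries
    const = np.full((n, n), mean_corr)
    np.fill_diagonal(const, 1.0)
    if vols is None:                               # approach 1: sample std dev
        vols = x.std(axis=0, ddof=1)
    return const * np.outer(vols, vols)

def garch11_variance_forecast(r, alpha, beta):
    """One-step-ahead variance from a variance-targeted GARCH(1,1)."""
    eps = np.asarray(r, dtype=float)
    eps = eps - eps.mean()                         # demeaned returns
    target = eps.var(ddof=1)                       # unconditional variance target
    omega = target * (1.0 - alpha - beta)
    sigma2 = target                                # starting value (assumption)
    for e in eps:
        sigma2 = omega + alpha * e**2 + beta * sigma2
    return sigma2                                  # forecast for period T + 1

rng = np.random.default_rng(0)
returns = rng.normal(scale=0.01, size=(500, 4))    # simulated stand-in data
# Approach 2: plug GARCH volatility forecasts into the same correlation structure.
vols = np.sqrt([garch11_variance_forecast(returns[:, i], 0.05, 0.90) for i in range(4)])
cov_garch = constant_correlation_cov(returns, vols=vols)
```

In practice `alpha` and `beta` would be re-estimated for each asset at every rebalance date; the fixed values above are placeholders.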
(3) DCC, DECO & CCC
While on the topic of GARCH, more complex time-varying multivariate models have been proposed, two of the most popular being DCC (Dynamic Conditional Correlation) and DECO (Dynamic Equicorrelation). In both DCC and DECO, the individual volatilities and the correlations are allowed to vary over time, and the models capture the autocorrelation structure of both. The DCC(1,1), much like GARCH(1,1), is specified as follows:
$$Q_t = (1 - a - b)\,\bar{Q} + a\,z_{t-1}z_{t-1}' + b\,Q_{t-1}, \qquad R_t = \mathrm{diag}(Q_t)^{-1/2}\,Q_t\,\mathrm{diag}(Q_t)^{-1/2}$$
where $R_t$ is the conditional correlation matrix (obtained by rescaling $Q_t$ to have a unit diagonal), $z_t$ is a vector of demeaned and standardized asset returns and $\bar{Q}$ is the unconditional correlation matrix. Estimation of the DCC is more involved but is often split into a two-step process: 1) Conditional asset variances are estimated via a standard GARCH(p,q) process, and the resulting conditional standard deviations are collected into a diagonal matrix $D_t$. 2) The conditional correlation matrix is estimated from the standardized demeaned returns, or $z_t = D_t^{-1}\epsilon_t$ where $\epsilon_t$ is a vector of demeaned asset returns.
In short, DECO follows the same structure as DCC, except that the conditional correlation matrix follows the constant correlation model with a time-varying mean correlation. See above link for more details.
Similarly, the CCC (Constant Conditional Correlation) model is like DCC except that correlations do not vary over time. They are calculated using the steps above: first standardize each demeaned return series by its conditional standard deviation, then compute the sample correlation matrix of the standardized returns.
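The correlation step shared by these three models can be sketched as below, assuming the standardized returns `z` have already been produced by the univariate GARCH step and that the DCC parameters `a` and `b` are given rather than estimated. The function names are illustrative, and the recursion is started at the unconditional matrix (an assumption).

```python
import numpy as np

def to_correlation(q):
    """Rescale a positive-definite matrix to have a unit diagonal."""
    d = 1.0 / np.sqrt(np.diag(q))
    return q * np.outer(d, d)

def ccc_correlation(z):
    """CCC: one constant correlation matrix from standardized returns."""
    z = np.asarray(z, dtype=float)
    return to_correlation(z.T @ z / z.shape[0])

def dcc_correlation_forecast(z, a, b):
    """DCC(1,1): run the Q_t recursion and return the forecast of R_{T+1}."""
    z = np.asarray(z, dtype=float)
    q_bar = z.T @ z / z.shape[0]          # unconditional target
    q = q_bar.copy()                      # starting value (assumption)
    for z_t in z:
        q = (1.0 - a - b) * q_bar + a * np.outer(z_t, z_t) + b * q
    return to_correlation(q)

def equicorrelation(corr):
    """DECO-style step: replace all pairwise correlations by their mean."""
    n = corr.shape[0]
    rho = (corr.sum() - n) / (n * (n - 1))
    out = np.full((n, n), rho)
    np.fill_diagonal(out, 1.0)
    return out

rng = np.random.default_rng(0)
z = rng.normal(size=(500, 3))             # simulated standardized returns
r_dcc = dcc_correlation_forecast(z, a=0.03, b=0.95)
r_deco = equicorrelation(r_dcc)
```

Setting `a = b = 0` collapses the DCC recursion to the CCC estimate, which is a handy sanity check on an implementation.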
(4) Single-Factor Model
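Using the overall market return as a return generating process, $r_{i,t} = \alpha_i + \beta_i r_{m,t} + \epsilon_{i,t}$, the variance of each individual asset becomes $\sigma_i^2 = \beta_i^2\sigma_m^2 + \sigma_{\epsilon_i}^2$ where $\sigma_m^2$ is the market variance and $\sigma_{\epsilon_i}^2$ is the variance of model innovations. Furthermore, the covariance between each pair of assets is restricted to its market fluctuations, $\sigma_{ij} = \beta_i\beta_j\sigma_m^2$.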
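A minimal sketch of the single-factor covariance follows, assuming betas are estimated by ordinary least squares and that `market` is some broad market return series (the choice of market proxy is an assumption here). The function name and simulated data are illustrative.

```python
import numpy as np

def single_factor_cov(returns, market):
    """Covariance implied by r_i = alpha_i + beta_i * r_m + eps_i."""
    x = np.asarray(returns, dtype=float)
    m = np.asarray(market, dtype=float)
    t = x.shape[0]
    x = x - x.mean(axis=0)
    m = m - m.mean()
    var_m = m @ m / (t - 1)                       # market variance
    betas = (x.T @ m) / (m @ m)                   # OLS slope for each asset
    resid = x - np.outer(m, betas)
    resid_var = (resid**2).sum(axis=0) / (t - 1)  # innovation variances
    cov = var_m * np.outer(betas, betas)          # beta_i * beta_j * var_m
    cov[np.diag_indices_from(cov)] += resid_var   # add idiosyncratic variance
    return cov, betas

rng = np.random.default_rng(0)
market = rng.normal(scale=0.01, size=500)         # simulated market returns
true_betas = np.array([0.8, 1.0, 1.2])
returns = np.outer(market, true_betas) + rng.normal(scale=0.005, size=(500, 3))
cov, betas = single_factor_cov(returns, market)
```

Because the OLS residuals are orthogonal to the market return, the diagonal of `cov` reproduces each asset's sample variance; only the off-diagonal entries are restricted by the model.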
(5) Shrinkage Models: LDW & OAS
The Ledoit-Wolf (LDW) and Oracle Approximating Shrinkage (OAS) estimators are both covariance estimators designed to handle the problem of small samples and large N, and to smooth the noise in the sample estimate by introducing a prior. I do not know much about OAS; it was introduced as an improvement over LDW on the basis of MSE for a normal distribution. Implementation was quite easy, so I threw it into the mix. For LDW, the prior $F$ is specified as the constant correlation model and the shrunk covariance matrix is equal to $\hat{\Sigma}_{shrunk} = \delta F + (1 - \delta) S$ where $S$ is the sample covariance estimate. The shrinkage intensity $\delta$ is estimated by minimizing the expected quadratic loss between the shrunk matrix and the actual covariance matrix. The authors derive, under finite moments and an i.i.d. assumption, a closed-form equation for the shrinkage parameter, $\hat{\delta} = \max\{0, \min\{\hat{\kappa}/T, 1\}\}$ with $\hat{\kappa} = (\hat{\pi} - \hat{\rho})/\hat{\gamma}$, which consists of three parts: $\pi$ is the sum of the variances of the sample covariance entries, $\rho$ is the sum of the covariances between corresponding entries of the sample and prior matrices and, lastly, $\gamma$ is the sum of the squared departures of the prior from the sample estimate. If we assume for now that the sample estimate does not correlate with the prior ($\rho = 0$), then the intensity can be intuitively thought of as the ratio of the error of the sample covariance to the error of the prior. The higher the variance of the sample covariance entries, the more the estimate is shrunk toward the prior, and the larger the prior error, the less it is shrunk.
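The three ingredients can be sketched in code as follows. This is my reading of the constant-correlation shrinkage estimator described above, written from the published formulas, so it should be checked against the authors' reference implementation before being relied on; the function name is an assumption, and the sample covariance here uses a divisor of $T$.

```python
import numpy as np

def ledoit_wolf_constant_corr(returns):
    """Shrink the sample covariance toward a constant-correlation prior."""
    x = np.asarray(returns, dtype=float)
    t, n = x.shape
    x = x - x.mean(axis=0)
    sample = x.T @ x / t
    var = np.diag(sample).copy()
    std = np.sqrt(var)

    # Prior: sample variances with the average pairwise correlation.
    corr = sample / np.outer(std, std)
    r_bar = (corr.sum() - n) / (n * (n - 1))
    prior = r_bar * np.outer(std, std)
    np.fill_diagonal(prior, var)

    # pi: summed (asymptotic) variances of the sample covariance entries.
    y = x**2
    pi_mat = y.T @ y / t - sample**2
    pi_hat = pi_mat.sum()

    # rho: summed covariances between prior and sample entries.
    theta = (x**3).T @ x / t - var[:, None] * sample
    np.fill_diagonal(theta, 0.0)
    rho_hat = np.trace(pi_mat) + r_bar * (np.outer(1.0 / std, std) * theta).sum()

    # gamma: squared distance between the sample estimate and the prior.
    gamma_hat = ((sample - prior) ** 2).sum()

    kappa = (pi_hat - rho_hat) / gamma_hat
    delta = max(0.0, min(1.0, kappa / t))
    return delta * prior + (1.0 - delta) * sample, delta

rng = np.random.default_rng(0)
returns = rng.normal(scale=0.01, size=(60, 9))   # short sample, nine assets
shrunk, delta = ledoit_wolf_constant_corr(returns)
```

Since the prior keeps the sample variances on its diagonal, shrinkage only moves the off-diagonal entries.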
(6) Exponentially Weighted Covariance Matrix (RiskMetrics)
In 1996, JPMorgan released a large technical document on analysing and quantifying risk. The document is what popularized the risk measure value at risk. In it, the authors proposed an exponentially time-weighted scheme for estimating the covariance matrix, on the intuition that more recent observations contain more relevant information for the next period's forecast than older observations, since older information was generated under different economic environments and states. More on the formula can be read here. Implementation was quite simple using the ewmcov function in pandas. In the RiskMetrics documentation, a decay factor $\lambda$ of 0.94 was suggested for daily returns. Since we are backtesting on a weekly time frame, I revised $\lambda$ to approximately 0.987, which is obtained by multiplying the equivalent span parameter (as defined in the pandas documentation) by 5.
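The decay-factor conversion and the weighting itself can be sketched as below. This assumes the pandas convention $\alpha = 2/(\text{span} + 1)$ with $\lambda = 1 - \alpha$, and it uses plain NumPy with normalized weights and demeaned returns rather than the pandas function mentioned above; the function names are illustrative.

```python
import numpy as np

def span_from_lambda(lam):
    """Span equivalent to a decay factor, using alpha = 2 / (span + 1)."""
    return 2.0 / (1.0 - lam) - 1.0

def lambda_from_span(span):
    """Decay factor equivalent to a span."""
    return 1.0 - 2.0 / (span + 1.0)

def ewma_cov(returns, lam):
    """Exponentially weighted covariance; the latest row gets the most weight."""
    x = np.asarray(returns, dtype=float)
    x = x - x.mean(axis=0)                       # demeaning is an assumption
    t = x.shape[0]
    weights = lam ** np.arange(t - 1, -1, -1)    # lam^(T-1), ..., lam^0
    weights = weights / weights.sum()            # normalize to sum to one
    return (x * weights[:, None]).T @ x

# Stretch the daily span by 5 to get a slower-decaying factor.
lam_weekly = lambda_from_span(5 * span_from_lambda(0.94))

rng = np.random.default_rng(0)
returns = rng.normal(scale=0.01, size=(500, 3))  # simulated stand-in data
cov_ewma = ewma_cov(returns, lam_weekly)
```

Under this convention, stretching the span by 5 gives a decay factor close to the 0.987 quoted above.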
To summarize, I present all model results, ordered by volatility, in the table below. Alongside volatility, I also calculated the historical VaR, beta, maximum drawdown, the average weight of each sector and the Herfindahl-Hirschman Index (HHI, a measure of portfolio diversification, here normalized by portfolio leverage).
Surprisingly, in forward testing, the more complex models tend to do the worst (e.g. the dynamic and shrinkage models) while the more parsimonious models perform the best (e.g. the constant correlation and single-factor models). The sample estimate also performs surprisingly well, coming in fourth place. VaR measures were fairly consistent across models. The best models were able to reduce their volatility to nearly half that of the market and, surprisingly, DCC was able to dominate in this area as well. Likewise, large drawdowns were best mitigated by the DCC model, which I believe is probably due to its ability to anticipate the autocorrelated correlation changes in 2008/2009. The HHI varied across models with no clear pattern. Lastly, most models were on average short the financial, energy and utilities sectors while being heavily long the consumer and industrial sectors.
Hope you enjoyed this post. Stay tuned for more 🙂