Risk has always played a large role in finance: the performance of many investment and trading strategies depends on efficiently estimating the underlying market risk. One of the most popular and commonly used representations of risk is the covariance matrix, where higher covariance values mean more volatility in the markets and vice versa. This comes with a caveat: empirical covariance values are always measured using historical data and are extremely sensitive to small changes in market conditions. This makes the empirical covariance matrix an unreliable estimator of the true risk and calls for better estimators.
In this post we will use the RiskEstimators class from MlFinLab, which provides several implementations of different ways to calculate and adjust empirical covariance matrices. Some of these try to remove inherent outliers from the matrix, while others focus on removing noise from the empirical values. Throughout this blog post, we will give a quick description of each algorithm and show how to use the corresponding implementation provided in MlFinLab.
The RiskEstimators class covers seven different algorithms relating to covariance matrix estimators. These algorithms include:
- Minimum Covariance Determinant
- Empirical Covariance
- Covariance Estimator with Shrinkage
- Semi-Covariance Matrix
- Exponentially-Weighted Covariance matrix
- De-Noising Covariance Matrix
- Covariance and Correlation Matrix Transformations
Note: The descriptions of these algorithms are all based upon the descriptions from the scikit-learn User Guide on Covariance Estimation. In fact, most of the methods in the class (except the de-noising and de-toning ones) are wrappers of sklearn’s covariance methods, with the added functionality of working specifically with financial data, e.g. having inbuilt returns calculation.
We will import the required libraries and the dataset – a small subset of historical prices of ETFs.
# importing our required libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import mlfinlab as ml
# reading in our data
stock_prices = pd.read_csv('stock_prices.csv', parse_dates=True, index_col='Date')
stock_prices = stock_prices.dropna(axis=1)
For the purpose of this post, we will only use 5 assets, so the differences between the calculated covariance matrices are easy to see.
stock_prices = stock_prices.iloc[:, :5]
Minimum Covariance Determinant
The Minimum Covariance Determinant (MCD) is a robust estimator of covariance that was introduced by P.J. Rousseeuw. From the scikit-learn User Guide on Covariance Estimation, “the basic idea of the algorithm is to find a set of observations that are not outliers and compute their empirical covariance matrix, which is then rescaled to compensate for the performed selection of observations”. The MlFinLab implementation is a wrapper around sklearn’s MinCovDet class, which uses the FastMCD algorithm developed by Rousseeuw and Van Driessen. A detailed description of the algorithm is available in the paper “Minimum Covariance Determinant” by Mia Hubert and Michiel Debruyne.
First, we can construct a simple (empirical) covariance matrix for comparison.
# A class with a function to calculate returns from prices
returns_estimation = ml.portfolio_optimization.ReturnsEstimators()
# Calculating the data set of returns
stock_returns = returns_estimation.calculate_returns(stock_prices)
# Finding the simple covariance matrix from a series of returns
cov_matrix = stock_returns.cov()
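For readers who want to see what this baseline amounts to without MlFinLab installed, here is a minimal sketch using only pandas and NumPy. It assumes simple percentage returns, which may not match the exact return definition used by calculate_returns, and it uses synthetic prices for five hypothetical assets (A to E) in place of stock_prices.csv.

```python
import numpy as np
import pandas as pd

# Synthetic daily prices for 5 hypothetical assets (a stand-in for stock_prices.csv)
rng = np.random.default_rng(0)
dates = pd.date_range("2020-01-01", periods=250, freq="B")
daily_moves = rng.normal(loc=0.0005, scale=0.01, size=(250, 5))
prices = pd.DataFrame(
    100 * np.exp(np.cumsum(daily_moves, axis=0)),
    index=dates,
    columns=["A", "B", "C", "D", "E"],
)

# Simple returns: (P_t - P_{t-1}) / P_{t-1}; the first row is NaN and is dropped
returns = prices.pct_change().dropna(how="all")

# Sample covariance of the returns (pandas normalises by N - 1)
sample_cov = returns.cov()
print(sample_cov.round(6))
```

The result is a symmetric 5 x 5 DataFrame indexed by asset name, which is the shape the later estimators are compared against.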
We can now construct the MCD matrix. One can use the minimum_covariance_determinant() method for this purpose.
# A class that has the Minimum Covariance Determinant estimator
risk_estimators = ml.portfolio_optimization.RiskEstimators()
# Finding the Minimum Covariance Determinant estimator on price data, with the random seed set to 0
min_cov_det = risk_estimators.minimum_covariance_determinant(stock_prices, price_data=True, random_state=0)
# Transforming our estimation from a np.array to pd.DataFrame
min_cov_det = pd.DataFrame(min_cov_det, index=cov_matrix.index, columns=cov_matrix.columns)
From the above images, you can see that the absolute values in the Minimum Covariance Determinant estimator are lower in comparison to the simple Covariance matrix, which means that the algorithm has eliminated some of the outliers in the data and the resulting covariance matrix estimator is a more robust one. Note that this method achieves the best result when the data has outliers in it.
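To make the effect of outliers concrete, the following sketch calls sklearn’s MinCovDet directly on synthetic data rather than on the ETF prices. The true covariance, the sample size and the 5% contamination level are all assumptions chosen for illustration; they are not taken from the original post.

```python
import numpy as np
from sklearn.covariance import MinCovDet, empirical_covariance

rng = np.random.default_rng(0)
true_cov = np.array([[1.0, 0.6],
                     [0.6, 2.0]])

# 500 clean observations drawn from a known two-dimensional distribution
X = rng.multivariate_normal(mean=[0.0, 0.0], cov=true_cov, size=500)

# Contaminate 5% of the rows with gross outliers far from the bulk of the data
X[:25] += rng.normal(loc=15.0, scale=1.0, size=(25, 2))

# Classical estimate: uses every row, including the outliers
emp_cov = empirical_covariance(X)

# Robust estimate: fitted on the subset of rows that look like inliers
mcd_cov = MinCovDet(random_state=0).fit(X).covariance_

# Distance of each estimate from the covariance that generated the clean data
emp_err = np.linalg.norm(emp_cov - true_cov)
mcd_err = np.linalg.norm(mcd_cov - true_cov)
print("empirical error:", emp_err)
print("MCD error:", mcd_err)
```

In this contaminated setting the MCD estimate should sit much closer to the generating covariance than the empirical one. On clean data the two estimates would be expected to be similar.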
Maximum Likelihood Covariance Estimator (Empirical Covariance)
The Maximum Likelihood Estimator of a sample is an asymptotically unbiased estimator of the corresponding population’s covariance matrix. This estimation works well when the number of observations is large enough in relation to the number of features.
We can implement this algorithm through the empirical_covariance() method in the MlFinLab library.
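As a small illustration of the estimator itself, the sketch below computes the maximum likelihood covariance by hand with NumPy and compares it with the N - 1 normalised sample covariance. The data is synthetic and the variable names are made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))  # 20 observations of 3 features
n = X.shape[0]

# Centre each feature on its sample mean
centered = X - X.mean(axis=0)

# Maximum likelihood estimate: divide the scatter matrix by N
mle_cov = centered.T @ centered / n

# Sample covariance with Bessel's correction: divide by N - 1
unbiased_cov = centered.T @ centered / (n - 1)

# The two differ only by the factor N / (N - 1), which tends to 1 as N grows
ratio = unbiased_cov[0, 0] / mle_cov[0, 0]
print(ratio, n / (n - 1))
```

NumPy’s np.cov and pandas’ DataFrame.cov default to the N - 1 form, so a maximum likelihood estimate can differ from them by the factor N / (N - 1). For long return series that difference is negligible.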
# Finding the Empirical Covariance on price data
empirical_cov = risk_estimators.empirical_covariance(stock_prices, price_data=True)
# Transforming Empirical Covariance from np.array to pd.DataFrame
empirical_cov = pd.DataFrame(empirical_cov, index=cov_matrix.index, columns=cov_matrix.columns)
One can observe that the Empirical Covariance is the same as the standard covariance function from the pandas package i.e. the
Visit Hudson and Thames Quantitative Research website to read the full article and download practical code:
This material is from Hudson and Thames Quantitative Research and is being posted on IBKR Traders’ Insight with permission from Hudson and Thames Quantitative Research.