Pairs Trading Basics: Correlation, Cointegration And Strategy – Part I


Visit: QuantInsti

Pairs trading is one of the most popular strategies for finding trading opportunities between two stocks that are cointegrated.

What does it mean for two stocks to be cointegrated, and how can a pairs trading strategy take advantage of it? This blog covers:

  • What is pairs trading?
  • History of pairs trading
  • What is the logic behind pairs trading?
  • Essential terms used in pairs trading
      • Correlation
      • Cointegration
      • Z-score
      • Augmented Dickey-Fuller Test
  • Steps for pairs trading
      • Select stocks for pairs trading
      • Entry points
      • Defining exit points
  • Pairs trading strategy using Excel and Python
  • Advantages of pairs trading
  • Disadvantages of pairs trading

What is pairs trading?

A pairs trading strategy usually trades a pair of stocks in a market-neutral way: whether the market is trending upwards or downwards, the long position in one stock and the short position in the other hedge each other. The key challenges in pairs trading are to:

  • Select a pair which will give you good statistical arbitrage opportunities over time
  • Select the entry/exit points

History of pairs trading

Pairs trading was first introduced in the mid-1980s by a group of technical analyst researchers who were employed by Morgan Stanley. The pairs trading strategy uses statistical and technical analysis to seek out potential market-neutral profits.

What is the logic behind pairs trading?

For a pairs trading strategy to work, the prices of the two stocks or financial instruments need to move around a similar mean and remain close to each other. On certain occasions, however, the price of one instrument may deviate from the other for a short period.

During this short period, the trader can take the opportunity to go long on one of the financial instruments while shorting the other. The positions are based on the current market prices of both stocks and their standard deviation.

Essential terms used in pairs trading

Some of the essential terms used in a pairs trading strategy are:

Correlation

Correlation is quantified by the correlation coefficient ρ, which ranges from -1 to +1. The coefficient indicates the strength and direction of the linear relationship between two variables.

The value of +1 means there exists a perfect positive correlation between the two variables, -1 means there is a perfect negative correlation and 0 means there is no correlation.

A perfect positive correlation is when one variable moves in either an upward or downward direction and the other variable always moves in the same direction by a proportional amount.

A perfect negative correlation is when one variable moves in the upward direction and the other variable always moves in the downward (i.e. opposite) direction by a proportional amount.

The correlation coefficient for the two variables is given by:

Correlation(X, Y) = ρ = COV(X, Y) / (SD(X) × SD(Y))

where:
COV(X, Y) = the covariance between X and Y
SD(X) and SD(Y) = the standard deviations of the respective variables
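The formula can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function name `correlation` and the return series `returns_a`, `returns_b` and `returns_c` are hypothetical values made up for illustration, not data from this article.

```python
import math


def correlation(x, y):
    """Pearson correlation: COV(X, Y) / (SD(X) * SD(Y))."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sample covariance and sample standard deviations (divide by n - 1)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return cov / (sd_x * sd_y)


# Hypothetical daily returns (in %) for three stocks
returns_a = [1.0, -0.5, 0.8, -1.2, 0.3]
returns_b = [2.0 * r for r in returns_a]   # moves with A, twice as far
returns_c = [-r for r in returns_a]        # mirror image of A

rho_ab = correlation(returns_a, returns_b)
rho_ac = correlation(returns_a, returns_c)
print(f"rho(A, B) = {rho_ab:.2f}")
print(f"rho(A, C) = {rho_ac:.2f}")
```

Because `returns_b` is an exact positive multiple of `returns_a`, the coefficient comes out at +1, and the mirrored series `returns_c` gives -1. Real return series will fall somewhere in between.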

If the correlation is high, say 0.8, traders may choose that pair for pairs trading. This high number represents a strong relationship between the two stocks. So if A goes up, the chances of B going up are also quite high.

Based on this assumption, a market-neutral strategy is set up in which A is bought and B is sold; which stock to buy and which to sell is decided from their individual price patterns.

Just looking at correlation might give you spurious results. For instance, if your pairs trading strategy is based on the spread between the prices of the two stocks, it is possible that the prices of the two stocks keep on increasing without ever mean-reverting.

Spread = log(a) – nlog(b)

where ‘a’ and ‘b’ = prices of stocks A and B respectively

For each share of A bought, n shares of B are sold.

Now suppose both ‘a’ and ‘b’ increase in such a way that the value of the spread decreases. This results in a loss, because stock A, which you hold, is rising more slowly than stock B, which you are short.
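A small worked example makes this scenario concrete. The prices and the hedge ratio n = 1 below are hypothetical numbers chosen only to illustrate the arithmetic of Spread = log(a) – n log(b); the helper `log_spread` is likewise an illustrative name.

```python
import math


def log_spread(a, b, n):
    """Spread = log(a) - n * log(b), with a and b the prices of A and B."""
    return math.log(a) - n * math.log(b)


n = 1.0                          # illustrative hedge ratio (assumption)
a_start, b_start = 100.0, 50.0   # hypothetical starting prices
a_end, b_end = 110.0, 60.0       # A rises 10%, B rises 20%

spread_start = log_spread(a_start, b_start, n)
spread_end = log_spread(a_end, b_end, n)
spread_change = spread_end - spread_start

print(f"spread at start: {spread_start:.4f}")
print(f"spread at end:   {spread_end:.4f}")
print(f"change:          {spread_change:.4f}")
```

Both prices rise, yet the spread falls by about 0.087, because the stock that was sold short gained more than the stock that was bought.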

Thus, one should be careful about relying on correlation alone when selecting pairs of stocks for a pairs trading strategy.

Cointegration

The most common test for pairs trading is the cointegration test. Cointegration is a statistical property of two or more time series which indicates whether a linear combination of the variables is stationary.

Let us unpack the statement above. The two time series in this case are the logs of the prices of stocks A and B. A linear combination of these variables can be the linear equation defining the spread:

As before,

Spread = log(a) – nlog(b)

where ‘a’ and ‘b’ are prices of stocks A and B respectively.

For each stock of A bought, you have sold n stocks of B.

If A and B are cointegrated, the spread defined above is stationary. A stationary process has properties that are very useful for modelling pairs trading strategies.

For instance, if the spread is stationary, its mean and variance remain constant over time.

So if we choose ‘n’, which is called the hedge ratio, so that spread = 0, stationarity implies that the expected value of the spread remains 0. Any deviation from this expected value is a statistical anomaly, and hence an opportunity for pairs trading!
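The idea can be sketched on synthetic data. The text above does not say how ‘n’ is obtained, so the example below assumes one common choice: an ordinary least squares regression of log(a) on log(b), taking the slope as the hedge ratio. The simulated price series, the constant `TRUE_N` and the helper `ols_slope` are all hypothetical, and no formal stationarity test is run here.

```python
import math
import random
import statistics

random.seed(7)  # fixed seed so the sketch is reproducible

TRUE_N = 1.5      # hedge ratio used to build the synthetic data (assumption)
NUM_DAYS = 1000

# log(b): a random walk, i.e. a non-stationary series
log_b = [math.log(50.0)]
for _ in range(NUM_DAYS - 1):
    log_b.append(log_b[-1] + random.gauss(0.0, 0.02))

# Stationary AR(1) noise that keeps log(a) tied to TRUE_N * log(b)
noise = [0.0]
for _ in range(NUM_DAYS - 1):
    noise.append(0.8 * noise[-1] + random.gauss(0.0, 0.005))

log_a = [TRUE_N * lb + e for lb, e in zip(log_b, noise)]


def ols_slope(x, y):
    """Slope of the least-squares line y = intercept + slope * x."""
    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


n_hat = ols_slope(log_b, log_a)  # estimated hedge ratio
spread = [la - n_hat * lb for la, lb in zip(log_a, log_b)]

print(f"estimated hedge ratio n: {n_hat:.3f}")
print(f"std dev of log(a):       {statistics.stdev(log_a):.4f}")
print(f"std dev of spread:       {statistics.stdev(spread):.4f}")
```

Each log price wanders on its own, but the spread built with the estimated hedge ratio varies far less than either series, which is the behaviour expected of a cointegrated pair.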


Z-score

The z-score rescales raw data points so that the new distribution has a mean of 0 and a standard deviation of 1. If the raw data are normally distributed, the z-scores follow the standard normal distribution N(0, 1), which is very useful for creating threshold levels.

For instance, in pairs trading, we have a distribution of spread between the prices of stocks A and B. We can convert these raw scores of spread into z-scores as explained below.

This new distribution will have a mean of 0 and a standard deviation of 1. It is easy to create threshold levels for this distribution such as 1.5 sigma, 2 sigma, 2.5 sigma, and so on.

The formula for z-score is as follows:

z = (x – mean) / standard deviation

x = a raw data point
z = the z-score
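As a minimal sketch, the formula can be applied to a short series of spread values. The numbers in `spread` and the 1.5 sigma threshold are hypothetical choices for illustration; the population standard deviation is used so that the resulting z-scores have a standard deviation of exactly 1.

```python
import statistics


def z_scores(values):
    """Apply z = (x - mean) / standard deviation to every data point."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population standard deviation
    return [(x - mean) / sd for x in values]


# Hypothetical spread observations
spread = [0.02, -0.01, 0.05, 0.00, -0.03, 0.08, -0.04, 0.01]
z = z_scores(spread)

THRESHOLD = 1.5  # illustrative threshold, in standard deviations
beyond_threshold = [abs(value) > THRESHOLD for value in z]

for x, value, flag in zip(spread, z, beyond_threshold):
    print(f"spread={x:+.2f}  z={value:+.2f}  beyond threshold: {flag}")
```

In this sample only the observation of 0.08 lies more than 1.5 standard deviations from the mean.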

The mean and standard deviation can be rolling statistics computed over a window of ‘t’ days, minutes, or other time intervals.
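A rolling version could look like the sketch below. It assumes the window includes the current observation and uses the sample standard deviation, so the window must hold at least two points. The function name `rolling_z_scores`, the window length and the spread values are hypothetical.

```python
import statistics


def rolling_z_scores(values, window):
    """Z-score of each point against the trailing `window` observations.

    The window includes the current point. Positions without a full
    window, or where the window has zero standard deviation, get None.
    """
    scores = []
    for i in range(len(values)):
        if i + 1 < window:
            scores.append(None)
            continue
        recent = values[i + 1 - window : i + 1]
        mean = statistics.fmean(recent)
        sd = statistics.stdev(recent)  # sample standard deviation
        scores.append((values[i] - mean) / sd if sd > 0 else None)
    return scores


# Hypothetical spread observations, one per time interval
spread = [0.02, -0.01, 0.05, 0.00, -0.03, 0.08, -0.04, 0.01, 0.03, -0.02]
z = rolling_z_scores(spread, window=5)

for t, (x, value) in enumerate(zip(spread, z)):
    label = "n/a" if value is None else f"{value:+.2f}"
    print(f"t={t}  spread={x:+.2f}  rolling z={label}")
```

The first `window - 1` positions have no z-score because there is not yet enough history. A shorter window reacts faster to changes in the spread, while a longer one gives smoother statistics.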

Stay tuned for the next installment in which Chainika Thakar will discuss Augmented Dickey Fuller Test.

Visit QuantInsti to read the full article:

Disclosure: Interactive Brokers

Information posted on IBKR Traders’ Insight that is provided by third-parties and not by Interactive Brokers does NOT constitute a recommendation by Interactive Brokers that you should contract for the services of that third party. Third-party participants who contribute to IBKR Traders’ Insight are independent of Interactive Brokers and Interactive Brokers does not make any representations or warranties concerning the services offered, their past or future performance, or the accuracy of the information provided by the third party. Past performance is no guarantee of future results.

This material is from QuantInsti and is being posted with permission from QuantInsti. The views expressed in this material are solely those of the author and/or QuantInsti and IBKR is not endorsing or recommending any investment or trading discussed in the material. This material is not and should not be construed as an offer to sell or the solicitation of an offer to buy any security. To the extent that this material discusses general market activity, industry or sector trends or other broad based economic or political conditions, it should not be construed as research or investment advice. To the extent that it includes references to specific securities, commodities, currencies, or other instruments, those references do not constitute a recommendation to buy, sell or hold such security. This material does not and is not intended to take into account the particular financial conditions, investment objectives or requirements of individual customers. Before acting on this material, you should consider whether it is suitable for your particular circumstances and, as necessary, seek professional advice.
