Understanding Logistic Regression

This post explains logistic regression and implements R code to estimate its parameters.


Logistic regression is a benchmark machine learning model. It has a binary response variable (Y) that takes on the values 0 or 1. We can find many variables of this kind, among them success/failure, bankruptcy/solvency, and exit/stay. To understand the logistic regression model, we should know the PMF (probability mass function) of the binomial distribution, since the log-likelihood function is constructed using this PMF. Given this log-likelihood function, we can estimate the parameters by maximizing it via MLE (maximum likelihood estimation).

PMF for the binomial distribution

Given success probability p and the number of trials n, the binomial distribution gives the probability of observing x (≤ n) successes as follows.

P(X = x; n, p) = C(n, x) p^x (1 − p)^(n−x), where C(n, x) = n! / (x!(n − x)!)

In the above equation for P(X = x; n, p), C(n, x) denotes the number of ways in which the number of successes can be x in n Bernoulli trials. Let’s take an example: suppose n = 3, x = 1.

C(3, 1) = 3! / (1! 2!) = 3

The term p^x (1 − p)^(n−x) denotes the probability of each single sequence with x successes and n − x failures. The probability for the above example (n = 3, x = 1) is as follows.

P(X = 1; 3, p) = 3 p (1 − p)^2

It is worth noticing that since P(X = x; n, p) involves one common p, the combination symbol can be used: every sequence with x successes has the same probability, so we simply count the sequences. This logic is similar to that of calculating a weighted average.
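As a quick numerical check (not part of the original post; p = 0.4 is an arbitrary illustrative value), the PMF for n = 3, x = 1 can be computed both by hand and with R's built-in dbinom():

```r
# binomial PMF check for n = 3, x = 1 with an illustrative p = 0.4
p <- 0.4
pmf_manual  <- choose(3, 1) * p^1 * (1 - p)^2    # C(3,1) * p * (1-p)^2
pmf_builtin <- dbinom(x = 1, size = 3, prob = p) # built-in binomial PMF
print(c(manual = pmf_manual, builtin = pmf_builtin))  # both 0.432
```

Both lines give 3 × 0.4 × 0.6² = 0.432, confirming the formula above.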

Standard linear regression estimates the conditional mean, which varies with the realization of the covariates (X). This means that, in addition to the unconditional mean, information from covariates is useful for improving forecast performance. The same story applies to logistic regression. Assuming p is neither a fixed nor an unconditional probability, we can estimate the conditional probability using explanatory variables (X).

Logit transformation

Since the explanatory variables X carry useful information for explaining the variation of a binary variable Y, we could set up a linear regression model for this binary response variable. But the output of a standard linear regression can be less than zero or greater than one. This is not appropriate because Y is a binary variable taking the values 0 and 1. For this reason, we need the following logit transformation, which converts a probability in (0, 1) to a real number in (−∞, ∞).

logit(p) = log(p / (1 − p))

Using this transformation, we can obtain the logistic regression model in the following way.

log(p / (1 − p)) = β0 + β1 x1 + β2 x2, or equivalently, p = 1 / (1 + exp(−(β0 + β1 x1 + β2 x2)))
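A minimal sketch of the logit and its inverse (the probabilities below are arbitrary illustrative values, not from the post):

```r
# logit maps probabilities in (0, 1) to the whole real line;
# the inverse (logistic) function maps them back
p <- c(0.1, 0.5, 0.9)
logit_p <- log(p / (1 - p))          # equivalent to qlogis(p)
p_back  <- 1 / (1 + exp(-logit_p))   # equivalent to plogis(logit_p)
print(logit_p)  # negative, zero, positive real numbers
print(p_back)   # recovers the original p
```

Note that p = 0.5 maps to exactly 0, and probabilities near 0 or 1 map to large negative or positive values.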

Log-likelihood function

Since each p_i differs with the covariates, the likelihood function of logistic regression, viewed as successive Bernoulli trials, has the following form.

L(β) = ∏ p_i^(y_i) (1 − p_i)^(1 − y_i), i = 1, …, n

We can see that there is no combination term like C(n, x) in this likelihood function. The reason is that each p_i differs with the covariates, as mentioned previously. The combination term is used only when the occurrence probabilities are fixed and all the same. In addition, to facilitate numerical optimization, we use the following log-likelihood function.

log L(β) = Σ [ y_i log(p_i) + (1 − y_i) log(1 − p_i) ], i = 1, …, n
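To see this concretely, here is a small hand-built example (the y and p_i values are made up for illustration): because each observation is a single Bernoulli trial with its own p_i, the sum of per-observation Bernoulli log-densities contains no combination term.

```r
# three observations, each with its own success probability p_i
y <- c(1, 0, 1)
p <- c(0.7, 0.2, 0.6)
# manual Bernoulli log-likelihood
ll_manual <- sum(y * log(p) + (1 - y) * log(1 - p))
# same value from dbinom with size = 1 (C(1, y) = 1, so no combination term)
ll_dbinom <- sum(dbinom(y, size = 1, prob = p, log = TRUE))
print(c(manual = ll_manual, dbinom = ll_dbinom))  # identical values
```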

It is typical to estimate the parameters by MLE via numerical optimization. But for ready-made models such as linear regression and logistic regression, it is convenient to use R's glm() function, which performs the optimization automatically.

R code for Logistic Regression

The following R code shows how to implement logistic regression in R. We provide two methods: one uses glm() for convenience, and the other maximizes the log-likelihood function directly by numerical optimization. To check the accuracy of the estimation, we set the population parameters to β = (0.3, 0.5, −0.2). It is therefore expected that the two methods produce the same results.

# Financial Econometrics & Derivatives, ML/DL using R, Python, Tensorflow  
# by Sang-Heon Lee 
# Logistic Regression
graphics.off()  # clear all graphs
rm(list = ls()) # remove all objects from the workspace
# Simulated data
    # number of observation
    nobs <- 10000
    # parameters assumed
    beta <- c(0.3, 0.5, -0.2)
    # simulated x
    x1 <- rnorm(n = nobs)
    x2 <- rnorm(n = nobs)
    bx <- beta[1] + beta[2]*x1 + beta[3]*x2
    p  <- 1 / (1 + exp(-bx))
    # simulated y
    y <- rbinom(n = nobs, size = 1, prob = p)
    # simulated y and x
    df.yx <- data.frame(y, x1, x2)      
# Estimation by glm
    fit_glm <- glm(y ~ ., data = df.yx, family = binomial)
# Estimation by optimization
    # objective function = negative log-likelihood
    obj.func <- function(b0, df.yx) {
        beta <- b0
        y <- df.yx[, 1]
        x <- as.matrix(cbind(1, df.yx[, -1]))
        bx <- x %*% beta
        p  <- 1 / (1 + exp(-bx))
        # we can calculate the log-likelihood
        # using one of the two ways below
        # 1. vector operation
        loglike <- sum(y*log(p) + (1-y)*log(1-p))
        # 2. loop operation
        # loglike <- 0
        # for(i in 1:nrow(df.yx)) {
        #    loglike <- loglike + y[i]*log(p[i]) + 
        #               (1-y[i])*log(1-p[i])
        # }
        # return the negative log-likelihood since optim() minimizes
        return(-loglike)
    }
    # run optimization
    m <- optim(c(0.1, 0.1, 0.1), obj.func, df.yx = df.yx,
               method = "BFGS",
               control = list(maxit = 5000, trace = 2))
# Comparison of results
    print(rbind(glm = coef(fit_glm), mle = m$par))

The above R program delivers the parameter estimates from two approaches.

From the estimation output, we can see that the two approaches give the same results. Numerical optimization for MLE is shown here only for educational purposes; it is therefore recommended to use the glm() function.

For additional insight on this topic and to download the R script, visit https://kiandlee.blogspot.com/2021/05/understanding-logistic-regression.html.

Disclosure: Interactive Brokers

Information posted on IBKR Traders’ Insight that is provided by third-parties and not by Interactive Brokers does NOT constitute a recommendation by Interactive Brokers that you should contract for the services of that third party. Third-party participants who contribute to IBKR Traders’ Insight are independent of Interactive Brokers and Interactive Brokers does not make any representations or warranties concerning the services offered, their past or future performance, or the accuracy of the information provided by the third party. Past performance is no guarantee of future results.

This material is from SH Fintech Modeling and is being posted with permission from SH Fintech Modeling. The views expressed in this material are solely those of the author and/or SH Fintech Modeling and IBKR is not endorsing or recommending any investment or trading discussed in the material. This material is not and should not be construed as an offer to sell or the solicitation of an offer to buy any security. To the extent that this material discusses general market activity, industry or sector trends or other broad based economic or political conditions, it should not be construed as research or investment advice. To the extent that it includes references to specific securities, commodities, currencies, or other instruments, those references do not constitute a recommendation to buy, sell or hold such security. This material does not and is not intended to take into account the particular financial conditions, investment objectives or requirements of individual customers. Before acting on this material, you should consider whether it is suitable for your particular circumstances and, as necessary, seek professional advice.

In accordance with EU regulation: The statements in this document shall not be considered as an objective or independent explanation of the matters. Please note that this document (a) has not been prepared in accordance with legal requirements designed to promote the independence of investment research, and (b) is not subject to any prohibition on dealing ahead of the dissemination or publication of investment research.

Any trading symbols displayed are for illustrative purposes only and are not intended to portray recommendations.