Among the many textbooks and tutorials on logistic regression, the very introductory one at the link above explains where the logistic regression model comes from:
In a binary classification problem, it is intuitive to decide whether an instance x belongs to class 0 or class 1 by the ratio P(c=1|x) / P(c=0|x). Writing P = P(c=1|x), so that 1-P = P(c=0|x), this ratio becomes the odds P/(1-P).
However, the odds have an inconvenient asymmetry with respect to P: swapping the values of P and 1-P turns P/(1-P) into its reciprocal rather than its negation, whereas the same swap does negate the logit ln(P/(1-P)). This symmetry makes the logit, rather than the odds, the natural choice of dependent variable.
By modeling the dependent variable by a linear form, we get:
ln(P/(1-P)) = a + bx
which is equivalent to
P = e^(a+bx) / (1 + e^(a+bx))
The tutorial also compares linear regression with logistic regression:
"If you use linear regression, the predicted values will become greater than one and less than zero if you move far enough on the X-axis. Such values are theoretically inadmissible."
This shows that logistic regression does not model the relation between x and the class label c directly; instead it models the relation between x and P(c=1|x), and then uses P(c=1|x) to decide whether x belongs to class 1 or class 0. So despite its name, logistic regression is not a regression method but a classifier.
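A minimal sketch of this two-step view (estimate P(c=1|x), then threshold it at 0.5) on a toy 1-D dataset, fitting a and b by gradient ascent on the log-likelihood; the data and learning-rate settings are invented for illustration:

```python
import math

def sigmoid(z):
    """The logistic function: P(c=1|x) for z = a + b*x."""
    return 1.0 / (1.0 + math.exp(-z))

# Toy 1-D data: class 0 clusters near x=0, class 1 near x=4.
xs = [0.0, 0.5, 1.0, 3.0, 3.5, 4.0]
cs = [0,   0,   0,   1,   1,   1]

# Fit a, b by gradient ascent on the log-likelihood.
a, b = 0.0, 0.0
lr = 0.1
for _ in range(2000):
    ga = sum(c - sigmoid(a + b * x) for x, c in zip(xs, cs))
    gb = sum((c - sigmoid(a + b * x)) * x for x, c in zip(xs, cs))
    a, b = a + lr * ga, b + lr * gb

# Classification step: threshold the estimated P(c=1|x) at 0.5.
def predict(x):
    return int(sigmoid(a + b * x) >= 0.5)
```

On this separable toy data the thresholded predictions recover the training labels; the regression machinery is only a means of estimating the probability.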
Additional information:
- A C++ implementation of large-scale logistic regression (together with a tech report) can be found at http://stat.rutgers.edu/~madigan/BBR
- A Mahout slide deck shows that they received a Google Summer of Code proposal to implement logistic regression on Hadoop, but I have not seen the result yet.
- Two papers on large-scale logistic regression were published in 2009:
1. Parallel Large-scale Feature Selection for Logistic Regression, and
2. Large-scale Sparse Logistic Regression