Compare prediction approaches in classification and regression problems
This post summarizes page 43 (classification) and page 47 (regression) of PRML.
Classification
Generative model
- model the joint probability density $p(\mathbf{x}, \mathcal{C}_k)$,
- then calculate the posterior $p(\mathcal{C}_k \vert \mathbf{x})$,
- then make a decision.
Example methods: linear discriminant analysis (LDA), quadratic discriminant analysis (QDA).
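As a sketch of these three steps, here is a one-dimensional LDA-style classifier on synthetic data; the Gaussian class-conditionals, shared variance, and equal priors are my own illustrative assumptions, not from the text:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic 1-D data for two classes (illustration only)
x0 = rng.normal(loc=-2.0, scale=1.0, size=100)    # samples of class C_0
x1 = rng.normal(loc=+2.0, scale=1.0, size=100)    # samples of class C_1

# Step 1: model p(x, C_k) via class-conditional Gaussians and priors
mu0, mu1 = x0.mean(), x1.mean()
var = np.concatenate([x0 - mu0, x1 - mu1]).var()  # shared variance (LDA assumption)
prior0 = prior1 = 0.5

def gauss(x, mu, var):
    return np.exp(-(x - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

# Step 2: posterior p(C_1 | x) by Bayes' theorem on the joint
def posterior1(x):
    joint0 = gauss(x, mu0, var) * prior0
    joint1 = gauss(x, mu1, var) * prior1
    return joint1 / (joint0 + joint1)

# Step 3: decide by picking the class with the larger posterior
label = int(posterior1(1.5) > 0.5)
```

Because the joint $p(\mathbf{x}, \mathcal{C}_k)$ is modeled, this route can also generate synthetic inputs and detect outliers, at the cost of solving a harder estimation problem than the decision alone requires.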
Discriminative model
- model the posterior $p(\mathcal{C}_k \vert \mathbf{x})$ directly,
- then make a decision.
Example methods: logistic regression.
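A minimal sketch of the discriminative route, fitting logistic regression by gradient descent on synthetic 1-D data (the data and learning-rate choices are my own):

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic 1-D data: class 0 around -2, class 1 around +2 (illustration only)
x = np.concatenate([rng.normal(-2.0, 1.0, 100), rng.normal(2.0, 1.0, 100)])
t = np.concatenate([np.zeros(100), np.ones(100)])

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Model the posterior p(C_1 | x) = sigmoid(w*x + b) directly; no class
# densities p(x | C_k) are ever estimated
w, b = 0.0, 0.0
lr = 0.1
for _ in range(500):                      # gradient descent on cross-entropy
    p = sigmoid(w * x + b)
    w -= lr * np.mean((p - t) * x)
    b -= lr * np.mean(p - t)

label = int(sigmoid(w * 1.5 + b) > 0.5)   # decision step
```

Compared to the generative route, only the quantity actually needed for the decision, the posterior, is estimated.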
Discriminant function (model)
- predict the decision (class label) directly, with probabilities playing no role.
Example methods: Least squares for classification, Fisher’s linear discriminant, perceptron algorithm.
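The perceptron illustrates this third route: a sketch on synthetic, linearly separable 1-D data, augmented with a bias feature (all data choices are my own):

```python
import numpy as np

rng = np.random.default_rng(2)
# Two well-separated synthetic clusters; targets use the {-1, +1} convention
x = np.concatenate([rng.normal(-2.0, 0.5, 50), rng.normal(2.0, 0.5, 50)])
T = np.concatenate([-np.ones(50), np.ones(50)])

# Augment each input with a constant bias feature
Phi = np.stack([x, np.ones_like(x)], axis=1)

# Perceptron: the model outputs a label directly; no probabilities anywhere
w = np.zeros(2)
for _ in range(100):                      # passes over the data
    for phi, t in zip(Phi, T):
        if np.sign(phi @ w) != t:         # update only on mistakes
            w += t * phi

label = np.sign(np.array([1.5, 1.0]) @ w) # predict for a new point x = 1.5
```

The price of skipping probabilities is that the output carries no confidence: a point just inside the boundary and a point far from it get the same hard label.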
Regression model
Assume the squared loss is used:
\[\mathbb{E}[L] = \iint \{y(\mathbf{x}) - t\}^2 p(\mathbf{x}, t) \, d\mathbf{x} \, dt\]Then the minimizing prediction is
\[y^*(\mathbf{x}) = \arg \min_{y(\mathbf{x})} \mathbb{E}[L] = \mathbb{E}[t \vert \mathbf{x}],\]i.e. the conditional expectation of $t$ given the feature vector $\mathbf{x}$.
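A quick Monte-Carlo check of this result, on a made-up conditional distribution of $t$ for one fixed $\mathbf{x}$: any prediction other than the conditional mean incurs a strictly larger expected squared loss.

```python
import numpy as np

rng = np.random.default_rng(3)
# Draws of t for one fixed x (an arbitrary illustrative distribution)
t = rng.normal(loc=2.0, scale=1.0, size=10_000)

def expected_loss(y):
    # Monte-Carlo estimate of E[{y - t}^2 | x]
    return np.mean((y - t) ** 2)

y_star = t.mean()                         # the conditional mean E[t | x]
loss_at_mean = expected_loss(y_star)
loss_shifted = expected_loss(y_star + 0.5)
```

Shifting the prediction away from the sample mean by $d$ increases the estimated loss by exactly $d^2$, matching the decomposition of the expected loss into the variance of $t$ given $\mathbf{x}$ plus the squared bias $\{y(\mathbf{x}) - \mathbb{E}[t \vert \mathbf{x}]\}^2$.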
Approach 1 (similar to the generative model in classification)
- model the joint probability density $p(\mathbf{x}, t)$,
- then calculate the posterior $p(t \vert \mathbf{x})$,
- then calculate $\mathbb{E}[t \vert \mathbf{x}]$.
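One crude, nonparametric way to carry out these three steps is a 2-D histogram estimate of the joint density; the sinusoidal target and bin counts are my own choices:

```python
import numpy as np

rng = np.random.default_rng(5)
# Synthetic data: t = sin(2*pi*x) plus Gaussian noise (illustration only)
x = rng.uniform(0.0, 1.0, 50_000)
t = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.1, 50_000)

# Step 1: estimate the joint density p(x, t) with a 2-D histogram
p_joint, x_edges, t_edges = np.histogram2d(x, t, bins=(20, 50), density=True)

# Step 2: p(t | x) is the joint renormalized within each x-bin
p_cond = p_joint / p_joint.sum(axis=1, keepdims=True)

# Step 3: E[t | x] is the mean of t under p(t | x), one value per x-bin
t_centers = 0.5 * (t_edges[:-1] + t_edges[1:])
x_centers = 0.5 * (x_edges[:-1] + x_edges[1:])
cond_mean = p_cond @ t_centers
```

This is the most demanding approach: it estimates a full 2-D density even though only a 1-D curve, the conditional mean, is needed for prediction.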
Approach 2 (similar to the discriminative model in classification)
- model the posterior density $p(t \vert \mathbf{x})$,
- then calculate $\mathbb{E}[t \vert \mathbf{x}]$.
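A sketch of this approach with a Gaussian conditional model, $p(t \vert x) = \mathcal{N}(t \mid wx + b, \sigma^2)$, fitted by maximum likelihood; the model form and the synthetic data are my own illustration:

```python
import numpy as np

rng = np.random.default_rng(6)
# Noisy linear data around t = 2x + 1 (illustration only)
x = rng.uniform(0.0, 1.0, 500)
t = 2.0 * x + 1.0 + rng.normal(0.0, 0.3, 500)

# Model p(t | x) = N(t | w*x + b, sigma2) and fit all three parameters by
# maximum likelihood; for w and b this reduces to least squares
A = np.stack([x, np.ones_like(x)], axis=1)
(w, b), *_ = np.linalg.lstsq(A, t, rcond=None)
sigma2 = np.mean((t - (w * x + b)) ** 2)  # ML estimate of the noise variance

def predict(x_new):
    # The optimal prediction is the mean of the fitted conditional density
    return w * x_new + b
```

Unlike a bare point predictor, the fitted density also yields the noise variance, so every prediction comes with an uncertainty estimate.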
Approach 3 (similar to the discriminant function in classification)
- predict $\mathbb{E}[t \vert \mathbf{x}]$ directly, with probabilities playing no role.
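A sketch of this direct approach: minimize the empirical squared loss by gradient descent, never forming any density (the data and hyperparameters are my own choices):

```python
import numpy as np

rng = np.random.default_rng(7)
# Noisy linear data around t = 2x + 1 (illustration only)
x = rng.uniform(0.0, 1.0, 500)
t = 2.0 * x + 1.0 + rng.normal(0.0, 0.3, 500)

# Fit y(x) = w*x + b by minimizing the empirical squared loss directly;
# no density p(x, t) or p(t | x) is modeled at any point
w, b = 0.0, 0.0
lr = 0.5
for _ in range(2000):                     # plain gradient descent
    err = (w * x + b) - t
    w -= lr * np.mean(err * x)
    b -= lr * np.mean(err)
```

Since the squared loss is minimized by the conditional mean, this fit approximates $\mathbb{E}[t \vert \mathbf{x}]$ anyway, but it provides no estimate of the noise around it.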