Study Note: SVM

Maximal Margin Classifier What Is a Hyperplane? Hyperplane: In a p-dimensional space, a hyperplane is a flat affine subspace of dimension $p-1$; e.g., in two dimensions, a hyperplane is a flat one-dimensional subspace, in other words, a line. Mathematical definition of a hyperplane: $$ \beta_0+\beta_1X_1+\beta_2X_2+\cdots+\beta_pX_p=0, \quad (9.1) $$ Any $X = (X_1,X_2,\dots,X_p)^T$ for which (9.1) holds is a point on the hyperplane. ...
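A minimal sketch of the side-of-hyperplane test implied by (9.1), assuming NumPy; the hyperplane $1 + 2X_1 - X_2 = 0$ and the test points are purely illustrative. The sign of $\beta_0+\beta_1x_1+\cdots+\beta_px_p$ tells us which side of the hyperplane a point lies on.

```python
import numpy as np

# Illustrative hyperplane in p = 2 dimensions: 1 + 2*X1 - X2 = 0
beta0 = 1.0
beta = np.array([2.0, -1.0])

def side_of_hyperplane(x):
    """Return +1 / -1 for the two sides of the hyperplane, 0 if x lies on it."""
    return np.sign(beta0 + beta @ x)

print(side_of_hyperplane(np.array([1.0, 1.0])))   # 1 + 2 - 1 =  2  -> +1.0
print(side_of_hyperplane(np.array([0.0, 1.0])))   # 1 + 0 - 1 =  0  ->  0.0 (on the hyperplane)
print(side_of_hyperplane(np.array([-2.0, 0.0])))  # 1 - 4 + 0 = -3  -> -1.0
```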

June 12, 2019 · 8 min · 1541 words · Me

Study Note: Resampling Methods - Cross Validation, Bootstrap

Resampling methods: involve repeatedly drawing samples from a training set and refitting a model of interest on each sample, in order to obtain additional information about the fitted model. Model assessment: the process of evaluating a model’s performance. Model selection: the process of selecting the proper level of flexibility for a model. Cross-validation: can be used to estimate the test error associated with a given statistical learning method, in order to evaluate its performance or to select the appropriate level of flexibility. Bootstrap: provides a measure of accuracy of a parameter estimate or of a given statistical learning method. ...
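A minimal sketch of the bootstrap idea described above, assuming NumPy; the sample, the statistic (the mean), and `B = 1000` are made up for the example. Resample with replacement, recompute the statistic on each resample, and take the standard deviation of those estimates as a measure of accuracy.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=100)  # made-up sample

def bootstrap_se(data, statistic, B=1000):
    """Estimate the standard error of `statistic` by resampling with replacement."""
    n = len(data)
    estimates = np.array([
        statistic(rng.choice(data, size=n, replace=True)) for _ in range(B)
    ])
    return estimates.std(ddof=1)

print("bootstrap SE of the mean:", bootstrap_se(data, np.mean))
print("theoretical SE, sigma/sqrt(n):", 2.0 / np.sqrt(100))
```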

June 12, 2019 · 4 min · 819 words · Me

Study Note: Model Selection and Regularization (Ridge & Lasso)

Subset Selection/Adjusted $R^2$/Ridge/Lasso/SVD ...

June 11, 2019 · 18 min · 3817 words · Me

Study Note: Comparing Logistic Regression, LDA, QDA, and KNN

Logistic regression and LDA methods are closely connected. Setting: Consider the two-class setting with \(p = 1\) predictor, and let \(p_1(x)\) and \(p_2(x) = 1-p_1(x)\) be the probabilities that the observation \(X = x\) belongs to class 1 and class 2, respectively. In LDA, from $$ \begin{align} p_k(x)=\frac{\pi_k \frac{1}{\sqrt{2\pi}\sigma}\exp{\left( -\frac{1}{2\sigma^2}(x-\mu_k)^2 \right)}}{\sum_{l=1}^K\pi_l\frac{1}{\sqrt{2\pi}\sigma}\exp{\left( -\frac{1}{2\sigma^2}(x-\mu_l)^2 \right)}} \end{align} $$ we obtain $$ \begin{align} \delta_k(x)=x\frac{\mu_k}{\sigma^2}-\frac{\mu_k^2}{2\sigma^2}+\log(\pi_k) \end{align} $$ and the log odds is given by $$ \begin{align}\log{\frac{p_1(x)}{1-p_1(x)}}=\log{\frac{p_1(x)}{p_2(x)}}=c_0+c_1x \end{align} $$ where $c_0$ and $c_1$ are functions of $\mu_1$, $\mu_2$, and $\sigma^2$. In logistic regression, $$ \begin{align} \log{\frac{p_1}{1-p_1}}=\beta_0+\beta_1x \end{align} $$ ...
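The connecting step, worked out here from the formulas above (a derivation added for clarity, consistent with the note's notation): since $p_k(x)\propto\pi_k f_k(x)$ and the terms not depending on $k$ cancel in the ratio, $$ \begin{align} \log\frac{p_1(x)}{p_2(x)}=\delta_1(x)-\delta_2(x)=x\frac{\mu_1-\mu_2}{\sigma^2}-\frac{\mu_1^2-\mu_2^2}{2\sigma^2}+\log\frac{\pi_1}{\pi_2}=c_0+c_1x, \end{align} $$ so that $c_1=(\mu_1-\mu_2)/\sigma^2$ and $c_0=\log(\pi_1/\pi_2)-(\mu_1^2-\mu_2^2)/(2\sigma^2)$. Both models are therefore linear in $x$ on the log-odds scale; they differ only in how the coefficients are fitted.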

June 10, 2019 · 4 min · 851 words · Me

Study Note: Linear Discriminant Analysis, ROC & AUC, Confusion Matrix

LDA vs. Logistic Regression: When the classes are well-separated, the parameter estimates for the logistic regression model are surprisingly unstable; linear discriminant analysis does not suffer from this problem. If n is small and the distribution of the predictors X is approximately normal in each of the classes, the linear discriminant model is again more stable than the logistic regression model. Linear discriminant analysis is popular when we have more than two response classes. ...
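A rough numerical illustration of the instability claim, assuming scikit-learn and simulated, well-separated Gaussian classes (the data and settings are made up for this sketch): with almost no regularization, the logistic regression slope diverges on separable data, while the LDA slope stays moderate.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(1)

# Simulated one-predictor, two-class data with well-separated classes
x0 = rng.normal(-5.0, 1.0, size=20)   # class 0
x1 = rng.normal(+5.0, 1.0, size=20)   # class 1
X = np.concatenate([x0, x1]).reshape(-1, 1)
y = np.array([0] * 20 + [1] * 20)

# Nearly unregularized logistic regression: on separable data the
# coefficient grows without bound as the fit approaches a step function.
logit = LogisticRegression(C=1e6, max_iter=10_000).fit(X, y)
lda = LinearDiscriminantAnalysis().fit(X, y)

print("logistic regression slope:", logit.coef_[0, 0])  # very large
print("LDA slope:", lda.coef_[0, 0])                     # moderate, ~ (mu1 - mu0) / sigma^2
```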

June 9, 2019 · 10 min · 1968 words · Me