Study Notes of Udacity A/B Testing

Study notes from the Udacity A/B Testing course. ...

July 3, 2020 · 35 min · 7355 words · Me

Spark SQL & DataFrame, SparkETL

PySpark code for Big Data Essentials: HDFS, MapReduce and Spark RDD ...

July 9, 2019 · 3 min · 484 words · Me

Study Note: Assessing Model Accuracy

MSE/Bias-Variance Trade-Off/K-Nearest Neighbors ...

June 17, 2019 · 5 min · 877 words · Me

Study Note: Clustering

K-Means Clustering/Hierarchical Clustering Algorithm ...

June 15, 2019 · 5 min · 1036 words · Me

Study Note: Dimension Reduction - PCA, PCR

Dimension Reduction Methods: Subset selection and shrinkage methods all use the original predictors, $X_1, X_2, \ldots, X_p$. Dimension reduction methods transform the predictors and then fit a least squares model using the transformed variables. Approach: Let $Z_1, Z_2, \ldots, Z_M$ represent $M < p$ linear combinations of our original $p$ predictors. That is, $$ \begin{align} Z_m=\sum_{j=1}^p\phi_{jm}X_j \end{align} $$ ...

June 14, 2019 · 11 min · 2235 words · Me