Modeling the 2018 PISA Dataset Based on PCA and Clustering

A series of interactive tutorials introducing principal component analysis, clustering, linear modeling, and cross-validation for large datasets.