Research School for Socio-Economic and
Natural Sciences of the Environment

Chemometrics (Multivariate Statistics)

Date: 13 October 2020 - 15 October 2020
Location: Wageningen

Target group/prerequisites

Participants are expected to have knowledge of basic statistics, e.g. hypothesis testing, correlation and linear regression, and experience using R and RStudio.

Course design

Each day consists of lectures in the morning and practicals using R in the afternoon.

Program topics

Day 1: Data pre-treatment, PCA and PCR
Discussion of different data pre-treatment methods, e.g. centering, autoscaling, Pareto scaling and range scaling. Data exploration using Principal Component Analysis (PCA), and regression on the principal components from PCA in Principal Component Regression (PCR).

Day 2: Modern regression techniques and model validation
Discussion of regression methods for high-dimensional data: Partial Least Squares (PLS, a technique similar to PCR, but one that also uses the response when constructing its components) and regularized regression (ridge/lasso). Ways of assessing model accuracy will also be discussed.

Day 3: Clustering and classification: k-means, hierarchical clustering, LDA and PLS-DA
Discussion of cluster analysis: choice of similarity measure, hierarchical clustering (agglomerative and divisive methods) and k-means.

Intended credits: 1.5 ECTS
Course organisation: VLAG Graduate School
More information: Course website