teaching | Machine Learning & Atmospheric Processes Group

Below is a growing set of interactive Google Colab notebooks that accompany my courses, workshops, and papers. Each one is self-contained: click Open in Colab, then Runtime -> Run all, and the activity launches in your browser (no other packages needed!).

1. Model Selection (AMS tutorial)

This short course is about making clear, defensible choices between models for supervised learning. We take an Occam’s razor-style approach to model selection, starting with simple linear models and working up to more sophisticated architectures depending on the problem context.
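The Occam's razor idea can be sketched in a few lines of plain Python. This is only an illustration, not the notebook's code: the helper names (`fit_constant`, `fit_linear`, `select_model`), the synthetic data, and the 5 % tolerance are all assumptions made for the example. Candidates are listed from simplest to most complex, and the simplest one whose validation error is close to the best is kept.

```python
import random

def fit_constant(xs, ys):
    """Simplest candidate: always predict the training mean."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_linear(xs, ys):
    """Next candidate: ordinary least-squares line (closed form)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x

def mse(model, xs, ys):
    """Mean squared error of a fitted model on (xs, ys)."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def select_model(candidates, train, valid, tolerance=0.05):
    """Return the simplest candidate whose validation MSE is within
    `tolerance` (relative) of the best score. `candidates` must be
    ordered from simplest to most complex. The tolerance is a
    hypothetical choice for this sketch."""
    scores = {name: mse(fit(*train), *valid) for name, fit in candidates}
    best = min(scores.values())
    for name, _ in candidates:
        if scores[name] <= best * (1 + tolerance):
            return name, scores

# Hypothetical candidate list, simplest first.
CANDIDATES = [("constant", fit_constant), ("linear", fit_linear)]

# Synthetic data with a genuine linear trend plus a little noise.
random.seed(0)
xs = [i / 10 for i in range(40)]
ys = [2.0 * x + 1.0 + random.gauss(0.0, 0.1) for x in xs]
train = (xs[0::2], ys[0::2])   # even indices for fitting
valid = (xs[1::2], ys[1::2])   # odd indices held out

chosen, scores = select_model(CANDIDATES, train, valid)
print("selected:", chosen)
```

More complex candidates (polynomials, trees, neural networks) would simply be appended to the end of the list; a more complex model is selected only when it beats the simpler ones by more than the tolerance.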

2. Interpretability (AMS tutorial)

This short course opens a two-part series on model interpretability and explainability. Here, we train neural networks of increasing size on real data to show how interpretation becomes harder as complexity grows. We will provide resources explaining where current methods help, where they fall short, and why.

3. Top-of-Atmosphere Brightness Temperatures (clear-sky RT model)

Built on the Rosenkranz & Kummerow (2003) Fortran clearsky routine, this notebook estimates TOA microwave brightness temperatures from surface emissivity plus atmospheric profiles. A 5 000-row sample dataset (2.9 MB) is included for instant testing, and the predicted Tb columns are ready to compare with GMI 1C products.
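To give a feel for how surface emissivity and the atmosphere combine into a TOA brightness temperature, here is a deliberately simplified sketch. It is not the Rosenkranz & Kummerow routine: it assumes a single isothermal, non-scattering atmospheric slab with a specular surface, and the function name and input values are made up for the example.

```python
T_COSMIC = 2.73  # cosmic background temperature in K

def toa_brightness_temperature(emissivity, t_surface, t_atmosphere, transmittance):
    """Single-slab clear-sky approximation (an assumption of this sketch):

        Tb = T_up + t * [e * Ts + (1 - e) * (T_down + t * T_cosmic)]

    with T_up = T_down = (1 - t) * T_atm for an isothermal slab.
    Temperatures are in kelvin; emissivity and transmittance lie in [0, 1].
    """
    t_up = (1.0 - transmittance) * t_atmosphere    # upwelling emission
    t_down = t_up                                  # downwelling emission
    surface_term = emissivity * t_surface          # emitted by the surface
    reflected_term = (1.0 - emissivity) * (t_down + transmittance * T_COSMIC)
    return t_up + transmittance * (surface_term + reflected_term)

# Illustrative inputs only: a low-emissivity (ocean-like) surface and a
# high-emissivity (land-like) surface under the same atmosphere.
tb_ocean = toa_brightness_temperature(0.5, 290.0, 260.0, 0.8)
tb_land = toa_brightness_temperature(0.95, 290.0, 260.0, 0.8)
print(f"ocean-like Tb: {tb_ocean:.1f} K, land-like Tb: {tb_land:.1f} K")
```

Even in this toy form, the low-emissivity surface appears much colder than the high-emissivity one, which is why surface emissivity is a required input alongside the atmospheric profiles.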

4. GPM GMI Precipitation Retrieval Sandbox

Explore GMI swath data, visualise multi-frequency brightness temperatures, and build a toy ML precipitation retrieval. Comes with one day of example observations pulled from NASA’s public archive.

5. NetCDF Climate Data & ML Snowmelt Regression

Step-by-step workflow for downloading ERA5-Land monthly means, preprocessing with xarray/pandas, and training a regression model that predicts snowmelt from climate variables. Tutorial slides are available here.

6. Bias-Corrected Climate Prediction API Demo

Live example using pre-trained temperature and precipitation models (Keras + scikit-learn). Shows how to query our bias-correction API, plot accuracy over time/space, and evaluate against ground-truth labels if available.

7. Sentinel-2 & NRCan Land-Cover Classification (RF & MLP)

Quick demo of off-the-shelf random forest and multilayer perceptron classifiers applied to Sentinel-2 imagery and NRCan reference data gathered in earlier recipes. Baseline performance is modest, but the notebook outlines pathways for tuning and optimisation. This was produced in part for an NRCan-promoted ML course for early-career scientists.

8. Precipitation Imaging Package (PIP) Microphysics API

Hands-on guide to the PIP dataset (doi:10.7302/37yx-9q53). Load NetCDF files into xarray, plot site maps, curve-fit PSD parameters, and compare rain vs snow, all through a simple Python interface.
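As a rough idea of what curve-fitting PSD parameters involves, the sketch below fits an exponential particle size distribution, N(D) = N0 exp(-Λ D), by linear least squares on ln N. The exponential form, the function name, and the synthetic bins are assumptions made for illustration; the notebook may use a different distribution or fitting routine.

```python
import math

def fit_exponential_psd(diameters_mm, concentrations):
    """Fit N(D) = N0 * exp(-lam * D) via a straight-line fit to
    ln N = ln N0 - lam * D. Bins with zero or negative concentration
    are skipped because their logarithm is undefined.
    Returns (N0, lam)."""
    pairs = [(d, math.log(n)) for d, n in zip(diameters_mm, concentrations) if n > 0]
    if len(pairs) < 2:
        raise ValueError("need at least two non-empty size bins")
    count = len(pairs)
    mean_d = sum(d for d, _ in pairs) / count
    mean_ln = sum(ln_n for _, ln_n in pairs) / count
    sdd = sum((d - mean_d) ** 2 for d, _ in pairs)
    sdl = sum((d - mean_d) * (ln_n - mean_ln) for d, ln_n in pairs)
    slope = sdl / sdd
    intercept = mean_ln - slope * mean_d
    return math.exp(intercept), -slope

# Synthetic, noise-free bins generated from known parameters
# (illustrative values, not taken from the PIP dataset).
true_n0, true_lam = 8000.0, 2.0
diameters = [0.5 * k for k in range(1, 11)]            # 0.5 to 5.0 mm
counts = [true_n0 * math.exp(-true_lam * d) for d in diameters]

n0_fit, lam_fit = fit_exponential_psd(diameters, counts)
print(f"N0 = {n0_fit:.1f}, lambda = {lam_fit:.3f} per mm")
```

Fitting in log space weights small and large particles more evenly than fitting the raw concentrations, but it also amplifies noise in sparsely populated bins, so real data may call for a weighted or nonlinear fit.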

9. UMAP Reproducibility Workflow

Recreates the nonlinear dimensional-reduction analysis from our phase-classification study using a 10 % subsample (automatically downloaded). Includes helper functions for visualisation and clustering so you can extend the work or swap in your own data.

10. Interpretable Snowfall-Rate Regressor (Toy Models)

Reproduces the toy NN + sparse-autoencoder pipeline from “Towards Interpretable Machine Learning in the Geosciences.” Runs in ~8 min on a Colab T4 GPU. Also available in a classification flavour for you to check out!

• Classification version

More notebooks are on the way, so stay tuned!