Trustworthy AI

We develop artificial intelligence systems that are reliable, transparent, and fair, helping users understand and trust the models they use.

As machine learning becomes embedded in operational systems across the geosciences, the question of trust becomes critical. Despite their impressive accuracy, black-box models often lack transparency, leaving users in the dark about how predictions are made or where they might fail. Hidden biases in these models can lead to serious real-world consequences. In other words, do we really know what our models are doing?

Our work focuses on building interpretable machine learning systems for geoscience applications. In a recent study (under review at JGR: Machine Learning and Computation), we explore the internal structure of neural networks trained to classify precipitation types, revealing interpretable “circuits” that resemble known physical processes. We show that these circuits remain stable across initializations and can be traced to specific phase-relevant features like temperature thresholds and moisture layers.

Operational ML systems are already used to forecast floods, monitor food security, and predict wildfire risk, yet their lack of interpretability poses ongoing challenges. To address this, we use sparse autoencoders to analyze the internal representations of neural networks trained to classify precipitation phase.
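
Conceptually, this approach trains an overcomplete autoencoder on a frozen network's hidden activations, with a sparsity penalty so that only a few latents fire per example. Below is a minimal PyTorch sketch of that general recipe, not our actual implementation; the layer sizes and `l1_coeff` value are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Overcomplete autoencoder with an L1 sparsity penalty on its latents.

    Trained to reconstruct the hidden activations of a frozen classifier,
    so that each latent ideally captures a single interpretable feature.
    """

    def __init__(self, d_hidden: int = 256, d_latent: int = 2048):
        super().__init__()
        self.encoder = nn.Linear(d_hidden, d_latent)
        self.decoder = nn.Linear(d_latent, d_hidden)

    def forward(self, h: torch.Tensor):
        z = torch.relu(self.encoder(h))  # sparse latent code
        h_hat = self.decoder(z)          # reconstructed activations
        return h_hat, z

def sae_loss(h, h_hat, z, l1_coeff=1e-3):
    # Reconstruction error plus an L1 term that encourages few active latents.
    return ((h - h_hat) ** 2).mean() + l1_coeff * z.abs().mean()
```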

We find that key neurons emerge with well-defined physical meanings, encoding features like melting-layer structure or cloud-top height. These insights allow domain experts to interrogate the model and build confidence in its outputs, or to identify conditions where the model may be extrapolating dangerously.
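
One way to vet whether a latent is physically meaningful (a hedged sketch with hypothetical helper names, not our published analysis pipeline) is to inspect its top-activating input profiles and correlate its activations with a physical scalar such as melting-layer height:

```python
import numpy as np

def top_activating_profiles(z, profiles, latent_idx, k=10):
    """Return the k input profiles (e.g. temperature soundings) that most
    strongly activate one SAE latent; z has shape (n_samples, d_latent)."""
    order = np.argsort(z[:, latent_idx])[::-1]
    return profiles[order[:k]]

def latent_feature_correlation(z, feature):
    """Pearson correlation of every latent with a physical scalar diagnostic
    (e.g. melting-layer height), flagging candidate interpretable neurons."""
    zc = z - z.mean(axis=0)
    fc = feature - feature.mean()
    denom = zc.std(axis=0) * fc.std() + 1e-12  # guard against dead latents
    return (zc * fc[:, None]).mean(axis=0) / denom
```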

Some hidden units form "circuits" with a consistent physical interpretation across training runs. This opens new doors for trust and explainability in atmospheric ML.
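
To test this kind of stability, one can match latent directions between two independently seeded autoencoders and check that the matched pairs remain highly similar. The sketch below, using SciPy's assignment solver on decoder weight directions, is one plausible way to do this, not necessarily the procedure used in the paper:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_latents(W_a, W_b):
    """Optimal one-to-one matching of latent feature directions between two
    SAE runs; W_a, W_b are (d_latent, d_hidden) arrays, e.g. decoder.weight.T
    from each model. High matched cosine similarities suggest the same
    'circuit' re-emerges across random initializations."""
    A = W_a / np.linalg.norm(W_a, axis=1, keepdims=True)
    B = W_b / np.linalg.norm(W_b, axis=1, keepdims=True)
    sim = A @ B.T                             # pairwise cosine similarities
    rows, cols = linear_sum_assignment(-sim)  # maximize total similarity
    return list(zip(rows, cols)), sim[rows, cols]
```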

Our goal is to create tools that make machine learning models more interpretable and accountable, so that as these systems scale into critical domains, the people who rely on them can understand how and why decisions are being made.


Related Publications

2025

  1. Leveraging Sparse Autoencoders to Reveal Interpretable Features in Geophysical Models
     Fraser King, Claire Pettersen, Derek Posselt, and 2 more authors
     Journal of Geophysical Research: Machine Learning and Computation, 2025