Technical Videos

Walkthroughs on scientific AI reproducibility, GPU internals, containerized ML workflows, and HPC infrastructure. Built for practitioners who want the mechanism explained, not just the headline.

Scientific ML · CUDA · Containers

Why Containers Don't Guarantee AI Reproducibility

CUDA parallelizes operations across thousands of GPU threads, and the order in which those threads accumulate values isn't fixed. Floating-point addition is not associative, so a different accumulation order can round to a different result. This video walks through how that breaks scientific reproducibility even inside identical containers.
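
As a rough illustration of why accumulation order matters, the sketch below sums the same three numbers in two different orders on the CPU. It is not taken from the video and does not use CUDA; `sum_in_order` is a hypothetical helper standing in for a reduction whose order is decided at run time.

```python
import math

def sum_in_order(values):
    """Accumulate left to right, the way a sequential reduction would."""
    total = 0.0
    for v in values:
        total += v
    return total

# The same three values, accumulated in two different orders.
order_a = [1e16, 1.0, -1e16]   # 1.0 is absorbed by 1e16 before the cancellation
order_b = [1e16, -1e16, 1.0]   # the large terms cancel first, so 1.0 survives

sum_a = sum_in_order(order_a)
sum_b = sum_in_order(order_b)

print(sum_a, sum_b)            # 0.0 1.0
print(sum_a == sum_b)          # False

# math.fsum tracks exact partial sums, so its result does not depend on order.
print(math.fsum(order_a) == math.fsum(order_b))  # True
```

Both runs execute identical code on identical inputs, which is the situation a container pins down. Only the order of the additions differs, and a container does not control that.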

CUDA · Floating-Point · Containers · Reproducibility · Scientific ML

May 2026 · Watch on YouTube →

HPC · Distributed Training · GPU Scaling

More GPUs, Slower AI?

The assumption that adding GPUs accelerates training is deeply embedded in how teams think about AI infrastructure. This video walks through what happens when that assumption breaks: communication overhead, gradient reduction complexity, and how scaling hardware can actively degrade convergence.
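
The communication-overhead part of that argument can be sketched with a toy cost model. This is not the model used in the video. It assumes strong scaling (a fixed amount of compute split across GPUs) and a ring all-reduce that takes 2(N-1) steps, each moving 1/N of the gradient. The function name `step_time` and every number in it are hypothetical values chosen only for illustration.

```python
def step_time(n_gpus, compute_s=1.0, grad_bytes=1e9,
              bandwidth_bytes_per_s=1e10, latency_s=5e-3):
    """Estimated seconds per training step on n_gpus (toy model, assumed numbers)."""
    compute = compute_s / n_gpus                 # compute is split evenly
    if n_gpus == 1:
        return compute                           # no gradient exchange needed
    # Ring all-reduce: 2(N-1) steps, each paying latency plus a 1/N-sized transfer.
    per_step = latency_s + grad_bytes / (n_gpus * bandwidth_bytes_per_s)
    return compute + 2 * (n_gpus - 1) * per_step

times = {n: step_time(n) for n in (1, 2, 4, 8, 16, 32, 64)}
for n, t in times.items():
    print(f"{n:>2} GPUs: {t:.3f} s/step, speedup {times[1] / t:.2f}x")
```

Under these assumed numbers the step time falls until about 8 GPUs and then rises again, because the latency term grows with the number of ring steps while the compute term keeps shrinking. The model says nothing about convergence, which is a separate effect.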

Distributed Training · GPU Scaling · HPC · Performance · Reproducibility

June 2026 · Watch on YouTube →

More videos in progress

Upcoming topics include cuDNN algorithm selection, physics-informed model validation, and distributed training reproducibility on HPC clusters.