Don't miss out! Join us at our next KubeCon + CloudNativeCon events in Mumbai, India (18-19 June, 2026), Yokohama, Japan (29-30 July, 2026), and Shanghai, China (8-9 September, 2026). Connect with our current graduated, incubating, and sandbox projects as the community gathers to further the education and advancement of cloud native computing. Learn more at https://kubecon.io

Tutorial: AI on Kubernetes Without the Chaos: Building Reproducible ML Environments with Argo and Kubeflow - Nourhan Mohamed, KodeKloud

A 2023 study found that data leakage compromised reproducibility in nearly 300 ML papers, showing how fragile machine learning remains. On Kubernetes, this fragility often becomes chaos: version drift, broken pipelines, and the “it worked on my laptop” problem make reproducibility a daily challenge.

In this tutorial, we’ll walk through a practical, open-source blueprint for building reproducible ML environments with Kubeflow, Argo, and ML Metadata. You’ll learn how to design modular workflows that can be rerun consistently, track experiments and dataset lineage, and apply GitOps principles to make pipelines auditable. We’ll also cover techniques like image pinning, artifact caching, and environment snapshots, plus strategies to avoid pitfalls like dependency drift and GPU scheduling conflicts.

By the end, you’ll have a clear framework to improve reliability and repeatability in ML workflows on Kubernetes.
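
To make two of the techniques named above more concrete, here is a minimal Python sketch of image pinning and artifact caching. It is not taken from the talk and does not use the Kubeflow or Argo APIs. The function names (`is_pinned`, `cache_key`), the example registry URL, and the choice of inputs to hash are all illustrative assumptions. The idea is that a step's output can be reused only when everything that influences it, including the exact image digest, is unchanged.

```python
import hashlib
import json
import re
from typing import Any, Dict

# Assumption: an image counts as "pinned" only if it ends in an immutable
# sha256 digest, rather than a mutable tag such as ":latest".
_DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")


def is_pinned(image_ref: str) -> bool:
    """Return True if the image reference ends in a sha256 digest."""
    return bool(_DIGEST_RE.search(image_ref))


def cache_key(image_ref: str, params: Dict[str, Any], inputs: Dict[str, str]) -> str:
    """Derive a deterministic cache key for a hypothetical pipeline step.

    `inputs` maps an input name to the content hash of that artifact.
    Refuses unpinned images, since a mutable tag could silently change
    what the step runs while the key stays the same.
    """
    if not is_pinned(image_ref):
        raise ValueError(f"image is not pinned by digest: {image_ref}")
    # sort_keys makes the serialization independent of dict insertion order.
    payload = json.dumps(
        {"image": image_ref, "params": params, "inputs": inputs},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Hypothetical image reference; the digest here is a placeholder.
    image = "registry.example.com/ml/train@sha256:" + "a" * 64
    key = cache_key(image, {"lr": 0.01, "epochs": 3}, {"dataset": "d1"})
    print(is_pinned(image), len(key))
```

If any parameter, input hash, or image digest changes, the key changes and the step would be rerun instead of served from cache. Real pipeline engines have their own caching rules, so treat this only as an illustration of the principle.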