Tuesday, Feb 17, 2026 · 6:30 PM – 8:00 PM

Machine Learning Systems in Production: From Model Quality to Operational Stability

Matteo Ricci

Senior Machine Learning Systems Engineer

About the talk

This talk explored the gap between building accurate models and operating machine learning systems that remain reliable in production. The session focused on the practical engineering challenges that appear after deployment, when models begin interacting with live data, real users, and changing product conditions.

The discussion covered monitoring, drift detection, feedback loops, retraining strategy, failure modes, and the platform-level decisions that affect long-term system stability. It also addressed why strong offline performance alone is often a weak predictor of production success.

By grounding the discussion in real delivery challenges, the session highlighted the engineering discipline needed to make machine learning dependable at scale.

This talk covered

• Production risks beyond offline model quality
• Monitoring and drift management
• Feedback loops and retraining strategy
• Reliability concerns in live ML systems
• Platform decisions that affect ML stability

About the speaker

Matteo Ricci is an ML systems engineer with experience across model deployment, platform reliability, and production monitoring. Ricci works on making machine learning systems more stable, observable, and maintainable in real product environments.

Past events

Feb 13, 2026

Rethinking Performance Engineering Around INP, Responsiveness, and User Perceived Speed

Jan 23, 2026

Platform Engineering as a Force Multiplier for Product and Infrastructure Teams