What You’ll Do
As a Lead MLOps Engineer, you’ll own the design and evolution of our ML infrastructure—enabling fast, reliable, and secure experimentation, deployment, and monitoring of AI agents and LLMs in production. You’ll guide a small but high-impact team of DevOps and ML engineers, ensuring our platform achieves best-in-class reliability, scalability, and velocity.
You’ll work cross-functionally with data scientists, AI engineers, and product teams to bring the next generation of AI workflows—from fine-tuning to agent orchestration—to life.
- Architect and evolve InteractiveAI’s ML infrastructure, from data ingestion to model serving and continuous learning loops
- Design and implement scalable, cloud-agnostic runtimes (Kubernetes/GPU clusters) across on-prem, VPC, and hybrid deployments
- Build automation for end-to-end ML pipelines (data → fine-tuning → evaluation → deployment)
- Establish gold standards for reproducibility, observability, and model governance
- Partner with AI Engineers to optimize training/inference performance and cost
- Build internal tooling to accelerate AI product delivery and reduce time-to-deploy
- Implement robust monitoring, logging, and alerting frameworks for ML workloads
- Drive adoption of CI/CD best practices for ML and infrastructure code
- Mentor and grow a small team of MLOps engineers, fostering technical excellence and ownership