Lead MLOps Engineer

Permanent employee, Full-time · Madrid, Lisbon
60,000 - 100,000 € per year
What You’ll Do
As a Lead MLOps Engineer, you’ll own the design and evolution of our ML infrastructure—enabling fast, reliable, and secure experimentation, deployment, and monitoring of AI agents and LLMs in production. You’ll guide a small but high-impact team of DevOps and ML engineers, ensuring our platform achieves best-in-class reliability, scalability, and velocity.
You’ll work cross-functionally with data scientists, AI engineers, and product teams to bring the next generation of AI workflows—from fine-tuning to agent orchestration—to life.
  • Architect and evolve InteractiveAI’s ML infrastructure, from data ingestion to model serving and continuous learning loops
  • Design and implement scalable, cloud-agnostic runtimes (Kubernetes/GPU clusters) across on-prem, VPC, and hybrid deployments
  • Build automation for end-to-end ML pipelines (data → fine-tuning → evaluation → deployment)
  • Establish gold standards for reproducibility, observability, and model governance
  • Partner with AI Engineers to optimize training/inference performance and cost
  • Build internal tooling to accelerate AI product delivery and reduce time-to-deploy
  • Implement robust monitoring, logging, and alerting frameworks for ML workloads
  • Drive adoption of CI/CD best practices for ML and infrastructure code
  • Mentor and grow a small team of MLOps engineers, fostering technical excellence and ownership
What We’re Looking For
We’re seeking a hands-on technical leader who combines deep MLOps expertise with a builder’s mindset—someone who thrives in fast-moving environments and can scale both systems and teams.

Minimum Requirements:
  • 5+ years of experience in DevOps, MLOps, or Infrastructure Engineering roles
  • Proven track record deploying and maintaining ML workloads in production
  • Strong expertise in containerization and orchestration (Docker, Kubernetes)
  • Experience building CI/CD pipelines for ML models and infrastructure
  • Proficiency with infrastructure-as-code tools (Terraform, Pulumi, CloudFormation)
  • Strong coding/scripting skills (Python, Bash, or similar)
  • Experience with monitoring and observability tools (Prometheus, Grafana, ELK, etc.)
  • Experience with at least one major cloud provider (AWS, GCP, or Azure)
  • Strong understanding of ML lifecycle management (training, evaluation, deployment, monitoring)
Additional Requirements:
  • Experience with MLflow, Weights & Biases, or other model-tracking systems
  • Understanding of fine-tuning workflows (LoRA, QLoRA, PEFT) and LLM serving
  • Exposure to RAG systems, vector databases, and large-model inference optimization
  • Experience implementing security and compliance practices (GDPR, ISO 27001, etc.)
  • Prior experience leading technical teams or mentoring engineers
  • Familiarity with distributed training and GPU cluster management is a plus
What You’ll Get
  • Competitive base salary (€60,000 to €100,000 per year) plus performance bonuses
  • Future equity opportunity for high performers
  • Health & wellness allowances
  • Private health insurance
  • Flexible work setup, ideally hybrid in Lisbon or Madrid, with travel when needed
  • 25 days of paid time off per year, not counting local public holidays
Who You Are
  • Proactive & Strategic: You anticipate system and organizational needs, designing scalable and future-proof solutions.
  • Technical Leader: You raise the bar for engineering excellence and help others do their best work.
  • Accountable & High-Ownership: You take full responsibility for uptime, performance, and delivery.
  • Builder Mentality: You’re comfortable in ambiguity, moving fast while maintaining reliability.
  • Collaborative Partner: You communicate clearly, build trust across teams, and balance pragmatism with long-term vision.
Interview Process
We keep our process focused and respectful of your time. Most candidates complete it in 2–3 weeks. Here’s what to expect:
  1. Intro Call – 30 minutes with our team to align on fit and expectations
  2. Technical Challenge – A practical MLOps design or automation task
  3. Technical Interview – Deep dive into systems architecture, automation, and ML infrastructure
  4. Leadership & Values Interview – A conversation to assess alignment with InteractiveAI’s culture and growth mindset
  5. Offer – Final conversation and offer
We’re building a team of builders: people who care about impact, quality, and growth.
If that’s you, let’s talk. Write to us at careers@interactive.ai
About Us

InteractiveAI is a fast-growing startup on a mission to empower enterprises with fully managed AI agent lifecycles. 
We are building the next generation of enterprise-AI solutions, delivering an end-to-end Agentic IDE alongside an extensible ecosystem of agentic resources and solutions. 

Our platform allows companies to orchestrate, monitor, evaluate, deploy, and improve AI agents, and soon to fine-tune and own their models.

We value autonomy, speed, and innovation, and we’re building a world-class team to match. Our squads are lean, focused, and execution-driven.

If you thrive in high-performance environments and want to be part of a company that rewards transformational outcomes, this is for you. 

We are looking forward to hearing from you!