What You’ll Do
As a Lead MLOps Engineer, you’ll own the design and evolution of our ML infrastructure—enabling fast, reliable, and secure experimentation, deployment, and monitoring of AI agents and LLMs in production. You’ll guide a small but high-impact team of DevOps and ML engineers, ensuring our platform achieves best-in-class reliability, scalability, and velocity.
You’ll work cross-functionally with data scientists, AI engineers, and product teams to bring the next generation of AI workflows—from fine-tuning to agent orchestration—to life.
- Architect and evolve InteractiveAI’s ML infrastructure, from data ingestion to model serving and continuous learning loops
- Design and implement scalable, cloud-agnostic runtimes (Kubernetes/GPU clusters) across on-prem, VPC, and hybrid deployments
- Build automation for end-to-end ML pipelines (data → fine-tuning → evaluation → deployment)
- Establish gold standards for reproducibility, observability, and model governance
- Partner with AI Engineers to optimize training/inference performance and cost
- Build internal tooling to accelerate AI product delivery and reduce time-to-deploy
- Implement robust monitoring, logging, and alerting frameworks for ML workloads
- Drive adoption of CI/CD best practices for ML and infrastructure code
- Mentor and grow a small team of MLOps engineers, fostering technical excellence and ownership