Summary
Director of Site Reliability Engineering at Doctolib, a leading European healthtech company. Requires 12+ years of experience, including 5+ years leading managers, with expertise in SRE, cloud infrastructure, database operations, and network infrastructure. This role will shape the company's multi-cloud strategy and build a proactive reliability culture.
- Location: Berlin
- Type: Full-time
- Level: Leadership
- Work mode: Hybrid
Why this role
As our Director of Site Reliability Engineering, reporting to our VP of Platform Engineering, you'll own the core infrastructure layers that everything at Doctolib runs on: cloud infrastructure, database operations, network infrastructure, and observability. You will also lead the Doctolib Operations Center (DOC) and drive a decisive shift from reactive operations to a proactive, world-class reliability culture.
This is a rare opportunity to shape the infrastructure backbone of Europe's leading healthtech company, at a moment when Doctolib is actively expanding multi-cloud capabilities, scaling to new countries, and building the reliability culture that will define the next decade of healthcare innovation.
Why this is an extraordinary challenge
- Real stakes, every day. When Doctolib is down, consultations don't happen, diagnoses are delayed, care journeys are interrupted. The infrastructure you build is a direct lever on patient outcomes, in a world where 8 of the top 10 causes of death in Europe are preventable.
- A once-in-a-generation platform transition. Multi-cloud, monolith modularisation, international expansion: all happening simultaneously. You won't inherit a finished platform. You'll define what it becomes.
- Reliability as the competitive moat. As we scale AI health companions, automate clinical workflows, and launch across Europe, the speed and resilience of the platform directly determine how fast 700+ engineers can ship innovations that change healthcare.
- A cultural build, not just a technical one. The incident response culture, observability standards, and operational ownership model you establish here will shape how Doctolib engineers work for years to come.
What you'll do
- Build and run a world-class SRE org of 25+ engineers across Cloud Infrastructure, Database & Storage, Network Infrastructure, Observability Tooling, and the Doctolib Operations Center
- Own the infrastructure strategy and roadmap — cloud, database, network, observability — and deliver against company OKRs
- Lead the Doctolib Operations Center: set incident response standards, drive MTTR reduction, and embed a blameless post-mortem culture across engineering
- Architect and execute our multi-cloud strategy — reducing vendor lock-in, cutting migration costs, and enabling international expansion
- Own network infrastructure at scale: load balancing, CDN/WAF, VPCs, peering, zero-trust networking across a high-traffic, multi-country platform
- Drive observability as a product — give 700+ engineers true visibility into system health and turn observability maturity into an operational excellence lever
- Lead from the front as a senior technical voice in the Platform org and broader Tech leadership team
Who you are
- 12+ years in software engineering, including 5+ years leading managers and running infrastructure or SRE organisations at scale
- Track record of taking SRE practices from reactive to proactive — with measurable reductions in incidents and MTTR
- Strong multi-cloud and network infrastructure experience: load balancing, CDN/WAF, VPCs, peering, at high-traffic scale
- Deep database operations background: large-scale transactional systems (PostgreSQL, Aurora), streaming/CDC (Kafka), data layer FinOps
- Experience building observability platforms that give teams genuine visibility — metrics, logs, traces, alerting
- Sharp process thinking: SLOs, error budgets, incident management, blameless post-mortems
- Outcome-driven: you track reliability, cost efficiency, and engineering velocity as business metrics, not just technical ones
- Strong communicator and influencer at executive level — equally credible with senior engineers and business stakeholders
- Builder of high-performing, people-first engineering cultures
- Fluent in English; comfortable in fast-paced, international environments
- You recognise yourself in our playbook values
Bonus points if you have…
- Experience in healthcare, regulated, or high-compliance industries (HDS, ISO 27001, SOC 2, GDPR, data sovereignty)
- Familiarity with our stack: Ruby on Rails, Node.js, Go, Python, React, AWS, GCP, Kubernetes, PostgreSQL, Datadog, GitHub Actions
- French language proficiency
- Experience with AI-augmented infrastructure tooling or ML platform operations
- M&A or post-acquisition infrastructure integration experience
What we offer
- A Deutschlandticket (Germany-wide public transport pass) fully paid for by Doctolib
- 28 vacation days + 1 additional day for each full calendar year of employment (up to a maximum of 30 days)
- Work from abroad for up to 10 days per year thanks to our flexibility days policy
- Company health insurance with great supplementary benefits through our partner Allianz
- Company pension scheme (bAV) through Allianz with an employer subsidy of 40% (15% within the probationary period)
- Enrollment in Doctolib's long-term employee value sharing plan called DoctoGrowth
- The Doctolib Parent Care program, which includes one month additional parental leave and much more
- Free mental health and coaching services through our partner Moka.care
- Subsidized sports membership through our partner Urban Sports Club
- A flexible workplace policy offering both hybrid and office-based modes
- Alongside healthy snacks and our regular breakfast buffet, we provide a subsidized meal benefit
- For caregivers and workers with disabilities, a package including an adaptation of the remote policy, extra days off for medical reasons, and psychological support
Director, Site Reliability Engineering
Doctolib · Berlin