Full Time

Senior Site Reliability Engineer / 1 week ago

Laravel
Attractive
Application ends: 2026-02-06

Quick Summary

This is a foundational, high-impact Senior SRE role at Laravel, responsible for establishing the SRE function and ensuring the reliability and scalability of global infrastructure supporting products like Laravel Cloud, Forge, and Vapor. The role requires deep expertise in AWS, Linux, and multi-region Kubernetes cluster management, coupled with proficiency in Infrastructure as Code using Terraform and programming skills in PHP, Bash, or Go. Key responsibilities include designing advanced observability systems (Prometheus, Grafana, Loki), defining SLOs, and driving significant automation to reduce manual toil. This is a fully remote position.

Senior Site Reliability Engineer (SRE) - Global Infrastructure & Kubernetes

At Laravel, we empower millions of developers. We are seeking a Senior Site Reliability Engineer (SRE) to ensure our global infrastructure is reliable, scalable, and elegant. If you thrive on managing multi-region Kubernetes clusters, building robust observability systems, and solving complex operational challenges through code, join us to build the foundation for Laravel Cloud, Nightwatch, Forge, and Vapor.

Role Overview:

As a founding member of our dedicated SRE function, you will report directly to Florian Beer in a high-impact, autonomous role. You will design and implement critical systems and act as a bridge between development and operations, fostering a blameless culture and shared responsibility for reliability across the organization.

Your 12-Month Mission Highlights:

  • First 30 Days: Stabilize incident response by creating comprehensive, actionable runbooks for core alerts.
  • Day 60: Pioneer "observability as code" by migrating alert rules and dashboards into version control.
  • Day 90: Establish clear, data-driven SLOs (Service Level Objectives) for all customer-facing products.
  • Year One: Transform system visibility with insightful dashboards and significantly reduce manual toil through sophisticated automation.

Key Responsibilities:

  • Architect Reliability: Establish SRE fundamentals and best practices from the ground up at Laravel.
  • System Design: Design, build, and maintain multi-region Kubernetes infrastructure and global distributed systems.
  • Automation: Solve operational challenges using software, minimizing manual intervention (toil) for product teams.
  • Observability: Design and implement advanced monitoring, logging, and alerting systems using tools like Prometheus, Grafana, and Loki.
  • Collaboration: Partner with product leads and SecOps to ensure reliability is a shared organizational responsibility.
  • Incident Response: Lead incident reviews and postmortems in a strictly blameless environment to foster continuous learning.

Required Skills & Experience:

  • Infrastructure Mastery: Deep experience with Linux system administration and cloud platforms, specifically AWS.
  • Orchestration & IaC: Proficiency with Kubernetes, Docker, and managing infrastructure via Terraform.
  • Programming Skills: Ability to solve problems with software and scripting using PHP, Bash, or Go.
  • Systems Thinking: A passionate approach to troubleshooting, capable of deconstructing complex systems into triageable components.
  • Reliability Mindset: Experience defining and implementing SLOs, SLIs, and SLAs, as well as capacity planning and performance tuning.
  • Soft Skills: Commitment to documentation, cross-team collaboration, and an automation-first mindset.

Bonus Skills:

  • Framework Familiarity: Previous experience working with the Laravel framework or our existing product suite (Cloud, Forge, Vapor, etc.).
  • Advanced Observability: Experience with Prometheus and Grafana Mimir for metrics storage and alerting.
  • Cost Optimization: Specialized knowledge in managing and optimizing resource usage and cloud costs.

Benefits:

  • Small, tight-knit team where every developer counts.
  • Fully remote and globally distributed working environment.
  • Option to attend Laracon conferences around the world.
  • Health care plan (Medical, Dental & Vision).
  • Paid time off (Vacation, Sick & Public holidays).
  • Family leave (Maternity, Paternity).
  • Pension plans (As locally applicable).
  • Performance-based bonus plan.
  • Company equity.

Laravel

  • Address
    Remote