Quick Summary
At Laravel, we build the foundation that empowers millions of developers. We are seeking a Senior Site Reliability Engineer (SRE) to scale our mission by ensuring our global infrastructure remains elegant and reliable. If you thrive on managing multi-region Kubernetes clusters, building robust observability systems, and solving complex operational puzzles with code, join us.
Description of the Role
As a founding member of our dedicated SRE function, you will report directly to Florian Beer. This is a high-impact, autonomous role focused on designing and implementing the systems powering Laravel Cloud, Nightwatch, Forge, and Vapor. You will bridge development and operations, advocating for a blameless culture and shared responsibility for reliability across the organization.
Your 12-Month Mission
Your first-year impact will be undeniable:
- By Day 30: Stabilize incident response by creating comprehensive, actionable runbooks for core alerts.
- By Day 60: Pioneer "observability as code" by migrating alert rules and dashboards into version control.
- By Day 90: Establish clear, data-driven SLOs for all customer-facing products.
- Year One: Give every team visibility into system health through company-wide dashboards, and significantly reduce manual toil through automation.
What You Will Do
- Architect Reliability: Establish SRE as a core function at Laravel, building fundamentals from the ground up.
- System Design: Design, build, and maintain multi-region Kubernetes infrastructure and global distributed systems.
- Automation: Solve operational challenges through software, reducing manual intervention (toil) for product teams.
- Observability: Design and implement monitoring, logging, and alerting systems using tools like Prometheus, Grafana, and Loki.
- Collaboration: Partner with product leads and SecOps so that reliability is a shared responsibility.
- Incident Response: Lead incident reviews and postmortems in a strictly blameless environment to foster continuous learning.
Requirements - What You Will Bring
- Infrastructure Mastery: Deep experience with Linux system administration and cloud platforms, specifically AWS.
- Orchestration & IaC: Proficiency with Kubernetes, Docker, and managing infrastructure via Terraform.
- Programming Skills: Ability to solve problems with software and scripting using PHP, Bash, or Go.
- Systems Thinking: A smart, passionate approach to troubleshooting and the ability to deconstruct complex systems.
- Reliability Mindset: Experience with SLO/SLI/SLA definition, capacity planning, and performance tuning.
- Soft Skills: Commitment to documentation, cross-team collaboration, and an automation-first mindset.
Requirements - Bonus Skills
- Framework Familiarity: Previous experience working with the Laravel framework or our existing product suite (Cloud, Forge, Vapor, etc.).
- Advanced Observability: Experience with Prometheus and Grafana Mimir for metrics storage and alerting.
- Cost Optimization: Specialized knowledge in managing and optimizing resource usage and cloud costs.
Benefits
- Small, tight-knit team where every developer counts.
- Fully remote and globally distributed working environment.
- Option to attend Laracon conferences around the world.
- Health care plan (Medical, Dental & Vision).
- Paid time off (Vacation, Sick & Public Holidays).
- Family leave (Maternity, Paternity).
- Pension plans (as locally applicable).
- Performance-based bonus plan.
- Company equity.

