Quick Summary
Daxko develops software solutions for fitness, wellness, and community organizations, encompassing member management, payments, digital engagement, and analytics. We empower thousands of fitness businesses, from small studios to large associations, to optimize operations, enhance member experiences, and achieve sustainable revenue growth.
We are seeking a Manager of Site Reliability Engineering (SRE) to lead a team focused on building resilient systems and ensuring the smooth operation of critical services. In this role, you will guide a team responsible for the reliability, performance, and operational health of our production environments.
You will collaborate with engineering leaders to maintain secure, scalable, and available systems for the organizations and communities that rely on Daxko's platforms.
What You’ll Do
As the Manager of Site Reliability Engineering, you will lead a team dedicated to the operational reliability of Daxko’s production platforms. Your focus will be on creating stable, high-performing systems and enabling your team to continuously improve operational and support processes for our products.
Key responsibilities include:
- Leading and supporting a team responsible for production system reliability and performance, including:
  - Setting clear performance expectations and goals for team members.
  - Providing ongoing coaching and real-time feedback.
  - Ensuring team members have necessary training and resources.
  - Coordinating on-call rotations and operational coverage.
  - Supporting the team during critical incidents and outages.
  - Managing team staffing, including hiring and headcount planning.
- Prioritizing and coordinating work across operational initiatives, deployments, upgrades, and infrastructure improvements.
- Ensuring high levels of system uptime, data integrity, and operational stability.
- Partnering with Engineering Leads to align platform operations with product development needs.
- Maintaining business continuity across all production assets.
- Monitoring system health, performance, and capacity to identify and resolve issues proactively.
- Serving as a technical escalation point for complex infrastructure or platform challenges.
- Providing regular reporting on system availability, response times, and capacity trends.
- Ensuring operations meet security, compliance, and regulatory requirements.
- Supporting and coordinating the team’s on-call rotation and incident response processes.
- Continuously improving operational practices through automation, tooling, and monitoring.
Technologies You’ll Work With
Our platform leverages modern infrastructure and cloud technologies. Strong experience in the following areas is important:
- Linux-based systems
- Web serving and traffic management technologies (NGINX, PHP, Traefik, F5)
- Virtualization platforms such as VMware
- Cloud platforms including AWS and Azure
- Containerization and orchestration (Docker, Kubernetes, Dynos)
- Messaging and caching technologies (Redis, RabbitMQ)
A strong security mindset and experience implementing infrastructure security controls are essential.
What You Bring
We are looking for a thoughtful technical leader who excels at solving complex operational challenges and helping engineers grow. Ideal candidates will have:
- Strong analytical and problem-solving skills.
- Clear communication and collaboration skills.
- Experience leading teams in fast-moving technical environments.
- Ability to balance multiple priorities and make thoughtful decisions under pressure.
- Strong organizational and time management skills.
- A customer-focused mindset and commitment to system reliability.
- Bachelor’s degree in a technical discipline or equivalent professional experience.
- 3–5 years of experience leading or managing globally distributed engineering teams.
- 3–5 years of experience in a Site Reliability Engineering or similar infrastructure-focused role.
Preferred Experience
- Experience serving as a technical lead on infrastructure or platform teams.
- Experience with modern observability and monitoring tools, such as OpenTelemetry, Instana, LogicMonitor, PagerDuty, or OpsGenie.
- Experience with infrastructure and automation tooling such as GitLab CI, Jenkins, Chef, Terraform, Elasticsearch, Kubernetes, or Rancher.
- Scripting experience in Ruby, Python, or Bash.
- Familiarity with SOC, PCI, or GDPR compliance standards.
- Experience working with issue tracking and collaboration tools such as the Atlassian suite.
- Experience supporting or developing applications built with Java, PHP, or Node.
- Experience automating operational processes and repetitive tasks.
Daxko is committed to building a diverse workforce, embracing diversity in thought, perspective, age, ability, nationality, ethnicity, orientation, and gender. The varied skills, perspectives, ideas, and experiences of our team members contribute significantly to our purpose and values.
We prioritize our team members' well-being, reflected in our offices, benefits, and perks for full-time employees. Some notable benefits include:
- Flexible paid time off.
- Affordable health, dental, and vision insurance options.
- Monthly fitness reimbursement.
- 401(k) matching.
- Paid new-parent leave.
- Casual work environments.
- Flexible work (remote and hybrid).
All information will be kept confidential according to EEO guidelines.
This is a remote position.
Compensation is determined by demonstrated skills and competencies, with the upper half of the range typically reserved for internal growth. Some roles may qualify for bonuses, commissions, or other performance-based incentives. We also offer a comprehensive benefits package, recognition programs, and career growth opportunities.
The annual pay range for this role is $139,400 – $217,400.


