Full Time

Remote Site Reliability Engineer (SRE) / 1 week ago

Digistore24
Attractive
Application ends: 2026-03-25

Quick Summary

Digistore24 seeks a remote Site Reliability Engineer with over 3 years of IT operations experience and mandatory fluency in English and German. This role focuses on enhancing system reliability through automation (IaC, CI/CD with GitHub Workflows, Helm, Kustomize), performance optimization, capacity planning, and incident response using tools like Prometheus, Grafana, or Elasticsearch. Required technical skills include Kubernetes/container technology and cloud services (Google Cloud preferred); PHP experience is a plus. The full-time position offers flexible hours and demands strong communication, collaboration, problem-solving, and self-organization.

Are you an experienced Developer or DevOps Engineer seeking a remote Site Reliability Engineer (SRE) role? Join our internationally successful software and education company, Digistore24, and elevate our system reliability to the next level.

Important: Fluency in both ENGLISH and GERMAN is a mandatory requirement for this position. Please do not apply if you do not speak both languages.

About Digistore24

Digistore24 is one of Europe's fastest-growing tech companies, driven by a mission to shape the digital future. We empower individuals to share their knowledge online through our software and expertise, helping them achieve their entrepreneurial dreams. In turn, millions of people gain access to valuable information that helps them reach their goals. To support our rapid growth, we are sustainably expanding our teams, collaborating with experts and strong personalities who align with our values, regardless of their location.

Your Role as a Site Reliability Engineer

  • Automation and Infrastructure as Code (IaC): Automate repetitive tasks, deployments, and system management to minimize human error and boost efficiency. This includes creating scripts, CI/CD pipelines, or automating infrastructure provisioning.
  • Reliability and Performance Optimization: Continuously enhance system uptime by identifying bottlenecks and optimizing system architecture.
  • Capacity Planning and Scaling: Assess and predict system resource requirements (CPU, memory, storage) to ensure infrastructure scalability with increasing demand. Implement auto-scaling solutions for seamless load spike management, maintaining performance under diverse conditions.
  • System Monitoring and Incident Response: Proactively monitor system performance, uptime, and reliability using tools like Prometheus, Grafana, or Elasticsearch. Detect and respond to issues before they affect users. Manage and resolve incidents, outages, and failures swiftly to minimize downtime, including documentation, communication, and post-incident analysis.
  • Incident Postmortems and Continuous Improvement: Conduct root cause analysis (RCA) after incidents to understand failures and prevent recurrence. Implement fixes, improvements, and best practices derived from postmortems to enhance system reliability and reduce future incidents.

Benefits at Digistore24

  • Play a crucial role in shaping cutting-edge projects within a collaborative work environment.
  • Enjoy flexibility in working time and location.
  • Work from our partner's coworking spaces or your home office, as long as uninterrupted internet access is ensured.
  • Access regular further education opportunities.
  • Benefit from the stability of a highly successful German high-tech company, product-funded, not investor-funded.
  • Join outcome-focused teams with a culture of direct feedback.
  • Receive modern equipment: ThinkPad or MacBook.
  • Be part of an international, collaborative team with strong cohesion.
  • Participate in spectacular team events across various European countries.
  • Experience autonomy from day one.
  • Contribute to a retirement scheme.
  • Work in a team on a first-name basis, without a dress code, where everyone is treated as an equal.
  • Flexible working hours from Monday to Friday (core working hours: 10 AM to 4 PM).

Skills & Qualifications for this SRE Role

  • Communication Mastery: Communicate precisely and empathetically, defusing potential conflicts with a solution-oriented approach. Maintain the right tone with stakeholders, developers, and your team, even under pressure, seamlessly switching between German and English.
  • Collaboration Wizardry: Collaborate effectively with developers, stakeholders, and operations to keep everyone aligned. Understand the challenges different teams face and find solutions that benefit the whole company.
  • Automation Sorcery: Champion automation to save time and reduce errors, implementing tools that boost team productivity.
  • Problem-Solving Genius: Deeply analyze problems, identify root causes, and devise solutions to prevent future incidents.
  • Self-organization: Thrive on autonomy, excelling at organizing and structuring complex projects while working remotely.

Technical Skills & Tech Stack

  • Kubernetes / Container Technology
  • CI/CD: Experience with GitHub Workflows, Helm, Kustomize.
  • Cloud Services: Preferably Google Cloud, but other cloud platforms are also acceptable.
  • Excellent German Language Skills: Superior spelling and grammar in German.
  • PHP Language Experience: A plus.

A Typical Day at Digistore24

Your day begins with a morning video call to discuss yesterday's progress and today's plans with your team. You prefer a structured approach, outlining your daily routine and goals. You consistently allocate time for the continuous development of our SRE processes, supported by your team.

During the daily team call, you report on priorities and blockers, receiving practical tips to overcome challenges. For several hours, you focus on developing improvements for auto-scaling, monitoring, and alerting, testing your ideas in practice. You document the approaches that worked for a one-on-one discussion with the Head of IT Operations.

After lunch, you assist a developer with a new CI/CD workflow, discussing requirements and providing an initial prototype. You then address a ticket to review an application's resource allocation, checking current utilization and adjusting the deployment as needed.

Upon discovering an endpoint missing from monitoring, you create a ticket and immediately implement the necessary code in the Terraform project to add it.

This Position is NOT for You If:

  • You do not identify with our company values.
  • You have less than 3 years of experience in IT operations.
  • You struggle with ownership and require detailed discussions with supervisors or colleagues for every task.
  • You have difficulty planning and prioritizing your tasks.
  • You do not enjoy finding solutions for complex problems.
  • You are not confident speaking both German AND English.

Our Values

Explore our company values here: https://careers.digistore24.com/kultur-und-werte. Please review them thoroughly. Are you ready to live them?

Digistore24

  • Address
    London, England