IONOS SE

Site Reliability Engineer (f/m/d)

At IONOS, the leading European provider of cloud infrastructure, cloud services and hosting services, you will work together with a wide range of teams. We are characterized by open structures, a friendly working culture and flat hierarchies with a strong team spirit. We firmly believe that work and fun are compatible, and offer you the right environment for this. Our constant growth means that we are always looking for new colleagues. Become part of IONOS and grow with us.

We are seeking a highly skilled and experienced Site Reliability Engineer to join our team, working on a 24/7 shift basis. The Site Reliability Engineering L2 department operates all IONOS Cloud IaaS and PaaS services. As a Site Reliability Engineer, you will be responsible for ensuring the stability, security, and performance of our complex and distributed systems. You will work closely with our development teams to design, implement, and maintain scalable and reliable infrastructure, and to automate and optimize our systems and processes.

Tasks

  • Provide technical Level 2 support with direct customer contact.
  • Maintain monitoring, logging, and alerting solutions using tools such as Prometheus, Grafana, and Loki to proactively detect blockers during shift rotation, and contribute to resolving complex issues in distributed systems.
  • Troubleshoot network (LAN/WAN/VPN, DNS, DHCP) and storage systems (file/object/block), including provisioning and operating highly available services on Linux and Kubernetes with Helm charts.
  • Maintain Infrastructure as Code, automation, and playbooks using tools such as Ansible, Terraform, GitLab CI/CD, ArgoCD, and scripting languages like Bash, Python, and Go.
  • Collaborate with development teams to enhance processes and deployments, and to ensure smooth integration of new services and applications into our cloud and Kubernetes environment.
  • Ensure the stable and secure operation of our platforms, including management of incidents end-to-end, from initial analysis to resolution and follow-up through Problem Management.

Qualifications

  • Willingness to work in a 24x7 shift model that includes nights, weekends, and holidays, combined with a strong problem-solving and troubleshooting approach to complex technical problems.
  • You have multiple years of experience as a Site Reliability Engineer or in a related role (Linux System Administrator, Platform Engineer, DevOps/Infrastructure Engineer, Full Stack Developer).
  • Strong experience with automation tools (e.g., Ansible, SaltStack), monitoring and observability tools (e.g., Prometheus, Grafana, Loki), and logging and alerting solutions (e.g., ELK Stack).
  • Strong experience with virtualized environments, including QEMU/KVM, OpenStack, and Proxmox, and with cloud storage technologies (file, object, block), as well as proficient knowledge of Docker and Kubernetes (K8s).
  • Proficiency in at least one programming or scripting language (e.g., Go, Python, Bash) for automation and monitoring tasks.
  • Experience with code management is required; knowledge of merge conflicts, feature branches, merge requests, and continuous integration and delivery (CI/CD) is a plus.

Nice to have:

  • Experience with RDMA, InfiniBand, and RoCE protocols.
  • Strong experience with Linux MD RAID (mdadm, sedadm) and LVM.
  • Proficiency in Linux performance tuning and network stack debugging (e.g., ethtool, perf, tcpdump, ibstat, ibtop).
  • Experience with S3, Ceph and software-defined networks.
  • Experience with established software development practices, including code reviews, build processes, packaging, and testing.

Language Skills: Must be fluent in German and English (at least CEFR level B2).

Location: Berlin
Note: At the end of the application process, candidates must undergo a security check. Your consent will be requested in good time during the process.

Benefits

  • Hybrid working model.
  • Shift working hours.
  • A subsidized canteen and various free drinks at some locations.
  • Modern office space with very good transport connections.
  • Various employee discounts for activities and products.
  • Employee events such as summer and winter parties, as well as workshops.
  • Numerous training and development opportunities.
  • Various health offers, such as sports and health courses.

Job info

Location: Berlin
Type: Full-time
Category: IT Software Development
Reference ID: 1220

About IONOS

IONOS is the leading European digitalization partner for small and medium-sized businesses (SMB). The company serves around six million customers and operates across 18 markets in Europe and North America, with its services being accessible worldwide. With its Web Presence & Productivity portfolio, IONOS acts as a 'one-stop shop' for all digitalization needs: from domains and web hosting to classic website builders and do-it-yourself solutions, from e-commerce to online marketing tools. In addition, the company offers Cloud Solutions to enterprises that are looking to move to the cloud as their businesses evolve.

We value diversity and welcome all applications - regardless of, for example, gender, nationality, ethnic or social origin, religion, disability, age as well as sexual orientation and identity, physical characteristics, marital status or any other irrelevant factor subject to applicable law.

IONOS SE
Recruiting Team IONOS
Hinterm Hauptbahnhof 3-5
D-76135 Karlsruhe
Jobs@ionos.com