Site Reliability Engineer - Apprenticeship

Work study (12 months)
Paris
Occasional remote
Salary: Not specified

Welcome to the Jungle

The position

Job description

The SRE Apprentice will join the Platform Team to discover and contribute to the infrastructure and systems that ensure the reliability, performance, and security of our production environments. Under the mentorship of experienced SREs, this apprenticeship bridges learning and hands-on contribution, applying software engineering principles to real infrastructure and operational challenges.

This role involves close collaboration with the SRE team, Development teams, and other stakeholders to contribute to automation, observability improvements, and infrastructure-as-code practices. You will progressively gain autonomy on well-scoped projects while learning incident management, capacity planning, and reliability engineering fundamentals in a production context.

As an Apprentice, you will report to the Platform Engineering Manager and be integrated within the Platform Team.


Key Responsibilities:

Technical Contribution & Learning

  • Participate alongside Development teams in infrastructure discussions, deployment processes, and operational requirements.

  • Contribute to monitoring, alerting, and observability improvements (dashboards, alerts, log hygiene).

  • Write and review Terraform / Terragrunt modules under supervision, learning Infrastructure-as-Code best practices.

  • Contribute to disaster recovery documentation and backup verification procedures.

Operational Excellence & Automation

  • Shadow and progressively contribute to incident response efforts, learning root cause analysis methodology.

  • Develop and improve runbooks and documentation for operational procedures.

  • Help ensure proper logging and monitoring coverage across systems.

  • Contribute to automation initiatives to reduce manual operations (scripts, tooling, pipeline improvements).

  • Learn and apply SRE practices (SLOs, error budgets, toil reduction) in day-to-day work.

Cross-team Collaboration & Knowledge Building

  • Work with development teams to understand and support operational readiness requirements.

  • Collaborate with the SRE team on infrastructure security measures.

  • Participate in knowledge sharing sessions and team rituals.

  • Document learnings, contribute to the team’s knowledge base, and share findings with peers.

  • Partner with team members to improve developer experience through tooling and documentation.



Preferred experience

  • You are a student in a Computer Science / Engineering program, looking for a one-year apprenticeship.

  • You have solid fundamentals in systems and want to develop a strong hands-on technical focus in infrastructure and reliability.

  • Here is our stack! You don’t need to master it, but familiarity with or curiosity about these tools is expected:

    • Our main cloud provider is AWS;

    • We use Kubernetes as our container orchestrator;

    • Our Infrastructure-as-Code is managed with Terraform and Terragrunt;

    • We use ArgoCD and CircleCI as our integration and deployment tools;

    • We use OpenTelemetry & Datadog to monitor our platforms;

    • Our applications run on GNU/Linux systems, such as Debian.

  • You’re comfortable or eager to learn:

    • Working with Linux/Unix systems.

    • Understanding distributed systems fundamentals and cloud architectures.

    • Writing scripts (Bash, Python or equivalent) to automate tasks.

    • Learning incident response practices and structured troubleshooting.

    • Working in both French and English, in a hybrid/remote context.

  • You have strong problem-solving skills and a methodical approach to understanding how systems work.

  • You’re reliability-curious: genuinely interested in how production systems run, how failures happen, and how to build resilient infrastructure.

  • It’s not required, but having touched our tech stack (Ruby, Elixir, React.js) or contributed to personal/open-source infra projects is a significant advantage.


Recruitment process

Step 1️⃣: A 30-minute interview with Lilia, Talent Acquisition Apprentice

Step 2️⃣: A 45-minute interview focused on job skills assessment and value with Nicolas, Senior Site Reliability Engineer

Step 3️⃣: A 1-hour values interview with Pascal, Platform Engineering Manager

Good luck!