Senior Linux System Administrator/SRE

Job summary
Permanent contract
Paris
Salary: Not specified
A few days at home
Skills & expertise
Collaboration and teamwork
Cloud infrastructure management
CentOS
Kibana
GitLab

Scaleway


The position

Job description

Founded in 1999, Scaleway is the cloud subsidiary of the Iliad Group, one of Europe's leading telecommunications companies. Our mission is to foster a more responsible digital industry by helping developers and businesses build, deploy, and scale applications on any infrastructure.

From our offices in Paris and Lille, we refine the Scaleway cloud ecosystem every day, and we are its first users.

Our roughly 25,000 customers choose us for our multi-AZ redundancy, our smooth user experience, our carbon-neutral data centers, and our native tools for managing multi-cloud architectures. Our products include fully managed solutions for bare metal, containerization, and serverless architectures, offering a responsible choice in cloud computing.

Join our dynamic team of nearly 600 employees from diverse backgrounds, in a stimulating, international environment that combines technical excellence, creativity, and sharing.

About the job

 

Reporting to our Engineering Manager, Emerick Mounoury, you will be responsible for ensuring that we can reliably deliver virtual machines and bare metal servers to our users around the world.

We expect you to have a strong background in Python development and system administration, along with some DevOps experience and SRE practice. Our systems evolve constantly and the tools we use to monitor and ensure their resilience need to evolve accordingly.

Minimum qualifications

  • Experience in system programming using at least one of these languages: Python, Bash, Go, etc.
  • Demonstrated ability to troubleshoot production system failures
  • A positive mindset and desire to work with a team
  • Passion for automation and incremental improvements to tooling
  • Experience with Linux systems: Ubuntu Server
  • Experience with virtualization: QEMU/KVM
  • Good understanding of computer networks: TCP/IP, DNS, load balancing, IPv6, firewalls, BGP and network virtualization
  • Good command of English

Preferred qualifications

  • Ability to meticulously identify and solve any kind of bug in any codebase
  • Experience with infrastructure-as-code and continuous deployment
  • Experience dealing with physical hardware automation
  • Experience with monitoring and logging systems
  • Experience managing relational databases
  • Knowledge of at least one cloud platform and related use-cases
  • Experience as an OSS contributor and/or maintainer
  • Knowledge of HPC (High Performance Computing)

Responsibilities

  • Create new tools and documentation, or optimize existing ones, to help identify, diagnose, and solve production incidents, automating as much as possible
  • Troubleshoot high-impact issues by working with multiple Engineering teams (Storage, Network, Hardware)
  • Take on-call responsibilities, mitigate issues encountered in production and answer our customers in real time
  • Ensure a high quality of service for our customers by leveraging observability and monitoring technologies
  • Manage the life cycle of hypervisors in production and take part in the fleet-wide migration plan
  • Empower your teammates to swiftly integrate and deploy software components across our virtualisation system
  • Help implement best practices for stability, resiliency, scalability, security, and performance across our virtualisation system

Our Technical Stack

  • Python/Bash
  • RabbitMQ + Celery
  • PostgreSQL + SQLAlchemy
  • HAProxy, Nginx, REST APIs / Flask
  • S3 API
  • Sentry, Prometheus, Grafana, Elasticsearch, Fluentd, Kibana
  • Ansible, AWX, Foreman
  • GitLab, Nexus
  • Ubuntu, Debian, CentOS
  • Jira, Confluence, Slack, GSuite

Location

This position is based in our offices in Paris or Lille (France).

Recruitment Process

  • Screening call - 30 mins with the recruiter
  • Manager Interview - 45 mins
  • Technical Interviews - 1 h 30 mins
  • HR Interview - 45 mins
  • Offer sent - 48 hours

If you don't tick every box, don't hesitate to apply anyway. Don't limit yourself to a job description: you never know!

🌐 Scaleway | Scaleway Blog | Scaleway on X
