This offer is no longer available.

Site Reliability Engineer (Devops), Instances

Permanent contract (CDI)
Paris
Salary: Not specified
Fully remote
Experience: > 2 years

Scaleway

The position

Job description

Context of the position
We are looking for a Site Reliability Engineer to join our Instances team. Your main mission will be to ensure we can reliably serve virtual machines to users around the world. We expect you to have a strong background in Python development and system administration, along with some experience of DevOps practices. Our systems evolve constantly, and the tools used to observe them and keep them resilient need to evolve accordingly. You will need to be a problem solver who is willing to collaborate and who knows how to use knowledge of system interactions to their advantage. Are you ready to look after our virtualisation system and strive to improve our users' daily lives? This is a unique opportunity to join Scaleway and ensure developers at any company get the high-quality virtual instance service they need.

What you’ll be doing

  • Create new tools and documentation, or optimise existing ones, to help identify, diagnose and remediate production incidents, automating as much as possible
  • Troubleshoot high-impact issues working with multiple engineering teams (Storage, Network, Hardware)
  • Take on-call responsibilities, mitigate issues encountered in production and provide the best real-time response to our customers
  • Ensure a high quality of service for our customers by leveraging observability and monitoring technologies
  • Manage the lifecycle of hypervisors in production and take part in fleet-wide migration plans
  • Empower your teammates to swiftly integrate and deploy software components of our virtualisation system
  • Help implement best practices in stability, resiliency, scalability, security and performance across our virtualisation system

What we expect from you

  • Experience in system programming with Python, Bash, …
  • Demonstrated ability to troubleshoot production system failures
  • A great attitude and desire to work with a team
  • Passion for incremental improvements to tooling and a love of all things automation
  • Experience with Linux systems: Ubuntu Server
  • Experience with virtualisation: QEMU/KVM
  • Good understanding of computer networks: TCP/IP, DNS, load-balancing, IPv6, BGP and network virtualisation

Nice to have

  • Experience with infrastructure as code and continuous deployment
  • Experience dealing with physical hardware automation
  • Experience with monitoring & logging systems
  • Experience administering relational databases
  • Knowledge of one cloud platform and related use-cases
  • Experience as an OSS contributor or maintainer

Technical stack & tools we use

  • Python
  • RabbitMQ + Celery
  • PostgreSQL + SQLAlchemy
  • HAProxy, Nginx, REST APIs / Flask
  • S3 API
  • Sentry, Prometheus, Grafana, Elasticsearch, Fluentd, Kibana
  • Ansible, AWX, Foreman
  • GitLab, Nexus
  • Ubuntu, Debian, CentOS
  • Jira, Confluence, Slack, G Suite

Do you recognise yourself in these lines and want to join a young, innovative, growing company that is a great place to work?

Then don’t wait any longer and join us :)

This position can be based in Paris or Lille, or be fully remote.
