Senior Machine Learning Engineer - Paris (hybrid)

Permanent contract (CDI)
Paris
Salary: Not specified
Frequent remote work
Experience: > 5 years

GitGuardian

The position

Job description

GitGuardian is a tech company, so engineering sits at the heart of everything we do. The department is working on solving challenging problems:

  1. Scanning various data streams at scale to find secrets in them (more than 10M code patches, messages or images daily).

  2. Developing components that are deployed on our customers’ infrastructure to securely collect and map non-human identities

  3. Training and deploying models and algorithms to surface, aggregate and contextualize rich metadata around each secret, then integrating those insights into the product without compromising user experience.

You’ll join our Machine Learning squad, a team of four engineers within our engineering department of more than 50 people, working together to build and ship ML features for our products.

Today, our priority is helping the SecOps teams that use GitGuardian to prioritize and navigate incidents. Some incidents, if abused, can cause hundreds of millions of dollars in damage.

We deeply believe machine learning is essential to building an effective prioritization algorithm, and that this algorithm must leverage all available context, from information in the patch and repository to company-level and asset-level data. This is why we work closely with both the Secret Detection team, in charge of our secret detection engine, and the Incidents team, which owns the interface and incident management in the app.

Your daily responsibilities will be to:

  • Write code daily to make our platform smarter, faster and more reliable.

  • Train, evaluate and iterate on models using our large multi-modal dataset.

  • Drive end-to-end ML/AI projects from scoping and prototyping through deployment and monitoring.

  • Level up our MLOps deployment to serve larger models at our scale, with the added complexity of self-hosted compatibility.

  • Bring expertise and best practices: define conventions, review code, and mentor junior engineers.

  • Contribute to the continuous improvement of our existing deployment pipelines: optimizing inference speed and pursuing any other ideas that improve our day-to-day work and reliability.

Technical environment

  • Languages & frameworks: Python, PyTorch/Transformers, ONNX Runtime, BentoML, scikit-learn, LiteLLM

  • Data & orchestration: DVC, SkyPilot, Snowflake, Dagster

  • Main Application: Celery, Django, PostgreSQL, Redis

  • Infrastructure & Deployment: AWS, Kubernetes, ArgoCD, GitLab

  • Collaboration: Slack, Linear, Notion

What makes this position unique?

GitGuardian is a tech-oriented company with a mission: making the world safer for developers. Thanks to very talented engineers, we sell a strong product to top-tier companies with high expectations. A data-driven company from day one, GitGuardian has more than 40B code patches in its databases, and we have been running our models at scale on a huge volume of data for years!


Desired profile

If you think you match at least 70% of these criteria, please apply!

We are looking for a Senior ML Engineer with strong MLOps and software engineering skills. Here’s what we consider essential for success in this role:

  • You are fluent in English and French and can express ideas to engineers and non-technical stakeholders.

  • You have experience shipping models in production (5+ years as an ML Engineer).

  • You master core ML skills: PyTorch, Transformers, scikit-learn, and designing custom training pipelines.

  • You are experienced in the following MLOps skills:

    • Experimentation Environment: DVC, SkyPilot, Dagster (or equivalent).

    • Model deployment: ONNX Runtime, BentoML (or equivalent) in cloud-native environments.

    • Infra & tooling: AWS, Kubernetes/ArgoCD, GitLab CI/CD, Docker.

    • Monitoring & reliability: Grafana, Sentry (or similar) for production ML.

  • You focus on building reusable and maintainable systems through pragmatic planning, balancing quick wins with a long-term vision.

The following skills would strengthen your application but aren’t required:

  • Having deployed LLMs or agent-based systems at scale.

  • Having domain experience in cybersecurity/secrets detection.

  • Being familiar with PostgreSQL, Django, Celery.

  • Having built or maintained self-hosted/on-prem ML deployments.


Interview process

1. Video call with a Talent Acquisition team member

To learn about your career plans and assess whether there is a mutual fit.

2. Technical interview with Engineers (1h30)

To evaluate your skills for the position and help you picture yourself in the role.
– Live coding & ML system design: model training, infra, monitoring, trade-offs.

3. Interview with your future manager

To learn more about you and your achievements, and to introduce you to the team.
– Deep dive on past projects, career goals, team fit.

4. Final interview with a Senior Engineering Manager

To present our company’s vision and ambitions for the next couple of years.
