GitGuardian is a tech company, so engineering sits at the heart of everything we do. The department is working on solving challenging problems:
Scanning various data streams at scale to find secrets in them (more than 10M code patches, messages or images scanned daily).
Developing components that are deployed on our customers’ infrastructure to securely collect and map non-human identities.
Training and deploying models and algorithms to surface, aggregate and contextualize rich metadata around each secret, then integrating those insights into the product without compromising user experience.
You’ll join our Machine Learning squad, a team of four engineers within our engineering department of more than 50 people, working together to build and ship ML features for our products.
Today, our priority is helping SecOps teams using GitGuardian prioritize and navigate incidents. Some incidents, if exploited, can cause hundreds of millions of dollars in damage.
We deeply believe machine learning is essential to building an effective prioritization algorithm, and that this algorithm must leverage all available context, from information in the patch and repository to company-level and asset-level data. This is why we work closely with both the Secret Detection team, in charge of our secret detection engine, and the Incidents team, which owns the interface and incident management in the app.
Your daily responsibilities will be to:
Write code daily to make our platform smarter, faster and more reliable.
Train, evaluate and iterate on models using our large multi-modal dataset.
Drive end-to-end ML/AI projects from scoping and prototyping through deployment and monitoring.
Level up our MLOps deployment to serve larger models at our current scale, while handling the added complexity of self-hosted compatibility.
Bring expertise and best practices: define conventions, review code, and mentor junior engineers.
Contribute to the continuous improvement of our existing deployment pipelines: optimize inference speed and bring any other ideas that improve our day-to-day work and reliability.
Technical environment
Languages & frameworks: Python, PyTorch/Transformers, ONNX Runtime, BentoML, scikit-learn, LiteLLM
Data & orchestration: DVC, SkyPilot, Snowflake, Dagster
Main Application: Celery, Django, PostgreSQL, Redis
Infrastructure & Deployment: AWS, Kubernetes, ArgoCD, GitLab
Collaboration: Slack, Linear, Notion
More details on our current stack here!
What makes this position unique?
GitGuardian is a tech-oriented company with a mission: making the world safer for developers. Thanks to very talented engineers, we sell a strong product to top-tier companies with high expectations. As a data-driven company from day one, we have more than 40B code patches in our databases, and we’ve been running our models at scale on a huge volume of data for years now!
If you think you match at least 70% of these criteria, please apply!
We are looking for a Senior ML Engineer with strong MLOps and Software Engineering skills. Here’s what we consider essential for success in this role:
You are fluent in English and French, and able to express ideas clearly to engineers and non-technical stakeholders,
You have experience shipping models in production (5+ years as an ML Engineer),
You have mastered core ML skills: PyTorch, Transformers, scikit-learn, and designing custom training pipelines.
You are experienced in the following MLOps areas:
Experimentation Environment: DVC, SkyPilot, Dagster (or equivalent).
Model deployment: ONNX Runtime, BentoML (or equivalent) in cloud-native environments.
Infra & tooling: AWS, Kubernetes/ArgoCD, GitLab CI/CD, Docker.
Monitoring & reliability: Grafana, Sentry (or similar) for production ML.
You focus on building reusable and maintainable systems through pragmatic planning, balancing quick wins with a long-term vision.
The following skills would strengthen your application but aren’t required:
Having deployed LLMs or agent-based systems at scale.
Having domain experience in cybersecurity/secrets detection.
Being familiar with PostgreSQL, Django, Celery.
Having built or maintained self-hosted/on-prem ML deployments.
1. Video call with a Talent Acquisition team member
To learn about your career goals and assess whether there could be a mutual fit.
2. Technical interview with Engineers (1h30)
To evaluate your skills for the position and help you picture yourself in the role.
– Live coding & ML system-design: model training, infra, monitoring, trade-offs.
3. Interview with your future manager
To learn more about you and your achievements, and to introduce you to the team.
– Deep dive on past projects, career goals, team fit.
4. Final interview with a Senior Engineering Manager
To detail our company’s vision and ambitions for the next couple of years.