Research Engineer – Founding Platform Role

Permanent contract
Paris
A few days at home
Salary: Not specified
Experience: > 7 years
Education: Master's Degree

Sigma Nova


Job description

Why This Role Exists

Until now, our scientists have written the code they needed to run experiments. Your job is to accelerate them: build shared training infrastructure, enforce good development practices, and set up the first wave of automation (CI/CD, experiment tracking, distributed training).

This is a founding role at the intersection of Research Engineering and MLOps. You won’t just “maintain pipelines” – you’ll design the foundations that make our research sustainable and deployable. Think: turning promising NeurIPS-level prototypes into clean, documented, versioned codebases that others can build upon.

You would work on open-source packages and be credited as a co-author on the papers.

You’ll lay the groundwork for the engineering and infrastructure teams that will follow. This is an individual contributor role with growth potential.

What You’ll Actually Do

1. Build shared ML infrastructure

  • Create reusable training pipelines and evaluation frameworks that both research teams can use, with an emphasis on distributed training

  • Set up experiment tracking, model versioning, and reproducible environments

  • Build internal tooling to reduce friction in the research workflow

2. Bridge research and production

  • Partner with researchers to refactor promising prototypes into maintainable code

  • Establish coding standards and documentation practices (we want to improve research code quality, not police it)

  • Package models for deployment when they’re ready to move beyond research

3. Establish MLOps foundations

  • Set up CI/CD pipelines, testing frameworks, and deployment automation

  • Implement best practices for version control, code review, and reproducibility

  • Build the infrastructure for distributed training on our GPU clusters

4. Enable knowledge sharing

  • Create documentation and internal guides

  • Mentor researchers on software engineering practices (Git, testing, modular code)

  • Help establish a culture of building on each other’s work instead of starting from scratch

About the scope: We know this is ambitious for one person — and it’s intentional. This is a founding role where you’ll set the direction and priorities, not execute everything alone. We’re planning to grow the ML/MLOps team to 5-6 people over the next year, and you’ll play a key role in shaping what that team becomes.

We’re looking for someone with a builder mindset who sees this breadth as an opportunity: the chance to architect our ML infrastructure from the ground up, make high-impact decisions early, and grow into a leadership position as the team scales.

The Opportunity:

  • Greenfield infrastructure: Define how we build AI systems from the ground up – no legacy tech debt

  • High-leverage impact: Your infrastructure directly enables breakthrough research, not just incremental product features

  • Founding team member: Shape the engineering culture and practices that will scale with the company

  • Growth trajectory: As our first engineer, you’ll help build and potentially lead the platform team

  • Early-stage dynamics: Processes are being defined in real-time; you’ll need comfort with ambiguity and rapid iteration

  • Generalist demands: You’ll touch everything from training pipelines to deployment to documentation (specialization comes later)

Tech Stack (current):

  • PyTorch

  • Distributed Training (torchtitan)

  • Cloud GPU infrastructure

  • Early-stage tooling decisions are still open (you’ll help choose)


Preferred experience

Must-Haves

  • We are looking for exceptional people with at least 6-7 years of experience who have covered a wide variety of roles and tasks across the ML world (from Research Engineering to MLOps)

  • Strong Python engineering: You write clean, tested, maintainable code by default (type hints, documentation, modular design) – and can teach others to do the same

  • PyTorch expertise: Deep familiarity with PyTorch for implementing and optimizing models; can debug researchers’ training code

  • MLOps fundamentals: Hands-on experience with Git workflows, CI/CD, Docker, experiment tracking tools (MLflow, Weights & Biases, etc.)

  • Distributed training: You’ve scaled training jobs across multiple GPUs or machines and understand the performance pitfalls (we are also happy for you to grow into this if you haven’t done it yet)

  • Bridge-builder mentality: You can work with brilliant researchers who sometimes write messy code, help them level up their software practices, and earn their trust.

  • Pragmatic autonomy: You’re comfortable scoping your own work, making pragmatic trade-offs between research velocity and engineering rigor, and asking for help when needed

  • Teaching ability: You can explain version control, testing, and modular design to scientists who’ve never used them – clearly and without condescension

Nice-to-Haves

  • Experience with generative AI or foundation models (LLMs, diffusion models, etc.)

  • Contributions to open-source ML projects (scikit-learn, Hugging Face, PyTorch ecosystem)

  • Cloud platform experience (AWS/GCP/Azure for ML workloads)

  • Know how to set up a Slurm-based cluster

  • Systems programming skills (C++/CUDA for performance optimization)

  • Graduate degree or publications in ML/AI (but strong practical experience trumps credentials)

  • Experience working in a research lab, supporting Research Scientists

Who Thrives in This Role

You might be a great fit if:

  • You’ve lived in both worlds: Worked in academic labs and startups, understand both cultures, and know how to blend research rigor with engineering pragmatism

  • You find satisfaction in cleanup: Refactoring a 3000-line project into a clean Python package feels rewarding, not tedious

  • You’re a technical Swiss Army knife: Equally comfortable debugging a PyTorch distributed training deadlock and designing a CI/CD pipeline from scratch

  • You’re an enabler, not a gatekeeper: You want to be the person who makes research teams 10x faster.

  • Ambiguity doesn’t paralyze you: When a researcher says “training is slow,” you can independently investigate, form hypotheses, and propose solutions

  • You respect the research: You understand that “messy research code” often represents months of brilliant problem-solving, and your job is to preserve the insights while improving the structure


Recruitment process

We will first review CVs in batches every Wednesday.

We are still finalizing the detailed recruitment process and will update this section as soon as possible, but the outline is:

  • Prescreen with Paul (Head of Talent)

  • Technical interview (remote, discussion format)

  • Onsite interview
