Is a CV required to apply for this position?

Yes, you must submit your CV to apply for this position.

Is remote work possible for this position?

Remote work is allowed for this position.

What type of contract is offered for this position?

This position is offered as a permanent contract (CDI).

Is a cover letter required to apply for this position?

Yes, a cover letter is required to apply for this position.

AI Data Engineer H/F - CAST

Permanent contract (CDI)

Meudon

Frequent remote work

Salary: Not specified

Experience: > 3 years

Education: Master's degree (Bac +5)

Posted last month

The position

Job description

Context

At CAST, the world leader in Software Intelligence, we are building the foundation to ground AI in AAA data (Aggregated, Accurate, and Augmented) sourced from real-world software and technology projects.

We go beyond manual curation: this role is about using AI to empower AI.
You will design intelligent pipelines that use LLMs, embeddings, and NLP tools to clean, enrich, and validate data, so that AI systems and autonomous agents can rely on it for training, reasoning, and contextual understanding.

Your Mission

As a Data Engineer specialized in AI Enablement, you will be responsible for building robust, intelligent, and traceable data pipelines that power AI models and agents with high-quality, semantically rich information.

Your Responsibilities:

Aggregate and structure data from diverse software ecosystems (codebases, APIs, tickets, documentation, architecture specs).
Apply LLMs, embeddings, and NLP techniques to automate data cleaning, entity extraction, metadata tagging, and semantic annotation.
Build and maintain semantic data pipelines for LLM fine-tuning and Retrieval-Augmented Generation (RAG).
Organize datasets for Agent-to-Agent (A2A) interactions using APIs, vector databases, and knowledge graphs.
Collaborate with AI research and engineering teams to evolve schemas, prompts, labeling strategies, and evaluation datasets.
Ensure data lineage, reproducibility, and version control across all workflows.

Desired profile

Your Profile

We’re looking for a hands-on Data Engineer who understands both the rigor of data pipelines and the creativity of AI enablement.
You’re analytical, curious, and passionate about leveraging AI to make data smarter.

Core Qualifications

Degree from a leading engineering school (Grande École) or equivalent university program.
3+ years of experience in data engineering, ML data operations, or structured data curation.
Proficiency in Python and data pipeline tools (Pandas, PyArrow, regex, Airflow).
Experience with LLM or NLP frameworks (Hugging Face, spaCy, LangChain).
Ability to use AI to clean, enrich, classify, and organize technical or unstructured content.
Strong understanding of tokenization, chunking, and model input preparation.
Experience working with software project data (Git repositories, APIs, documentation).

Bonus Skills

Knowledge of vector databases (FAISS, Qdrant, Weaviate) or knowledge graphs (Neo4j, RDF, SPARQL).
Exposure to agentic AI or autonomous AI frameworks (LangChain Agents, AutoGPT, OpenAgents).
Experience with RAG architectures, LLMOps, or prompt pipelines.
Background in software engineering or technical documentation.

Interview process

Recruitment Process

Our recruitment process consists of three steps:

Initial interview with our HR team.
Discussion with Guillaume, our Product Management Director, and Christophe, our R&D Director.
Final meeting to share our decision and next steps.

With us, the recruitment process moves quickly and efficiently!

Why Join Us?

Be part of a global AI innovation hub shaping the next generation of Software Intelligence.
Work at the intersection of data, AI, and software engineering, with real-world impact.
Collaborate with top AI experts and contribute to groundbreaking initiatives in AI enablement and automation.

The company

CAST

Software, SaaS / Cloud Services

350 employees

Founded in 1990

Revenue: €54M

Who are they?

CAST, the world leader in "Software Intelligence", helps you understand how the most complex software systems work. For more than 25 years, we have been developing advanced tools that automatically analyze the architecture and code of applications.

Our goal is to help technical teams and executives master complexity, reduce technical debt, and accelerate the modernization of their systems.

With strong R&D rooted in France and a well-established international presence, CAST supports hundreds of large companies every year across all sectors (50% in the United States, 40% in Europe, 10% in India and China).

If you are passionate about semantic code analysis, software architecture, and building tools that make the invisible visible, CAST is the ideal place to grow, innovate, and have a concrete impact.

Workplace

3 Rue Marcel Allégot, 92190 Meudon, France
