AI Data Engineer H/F

Full-time
Meudon
A few days from home
Salary: Not specified
Experience: > 3 years
Education: Master's degree

CAST

Position

Job description

Context

At CAST, the world leader in Software Intelligence, we are building the foundation to ground AI with AAA data (Aggregated, Accurate, and Augmented) sourced from real-world software and technology projects.

We go beyond manual curation: this role is about using AI to empower AI.
You will design intelligent pipelines that use LLMs, embeddings, and NLP tools to clean, enrich, and validate data, so that AI systems and autonomous agents can rely on it for training, reasoning, and contextual understanding.

Your Mission

As a Data Engineer specialized in AI Enablement, you will be responsible for building robust, intelligent, and traceable data pipelines that power AI models and agents with high-quality, semantically rich information.

Your Responsibilities:

  • Aggregate and structure data from diverse software ecosystems (codebases, APIs, tickets, documentation, architecture specs).

  • Apply LLMs, embeddings, and NLP techniques to automate data cleaning, entity extraction, metadata tagging, and semantic annotation.

  • Build and maintain semantic data pipelines for LLM fine-tuning and Retrieval-Augmented Generation (RAG).

  • Organize datasets for Agent-to-Agent (A2A) interactions using APIs, vector databases, and knowledge graphs.

  • Collaborate with AI research and engineering teams to evolve schemas, prompts, labeling strategies, and evaluation datasets.

  • Ensure data lineage, reproducibility, and version control across all workflows.


Job requirements

Your Profile

We’re looking for a hands-on Data Engineer who understands both the rigor of data pipelines and the creativity of AI enablement.
You’re analytical, curious, and passionate about leveraging AI to make data smarter.

Core Qualifications

  • Degree from a leading engineering school (Grande École) or equivalent university program.

  • 3+ years of experience in data engineering, ML data operations, or structured data curation.

  • Proficiency in Python and data pipeline tools (Pandas, PyArrow, regex, Airflow).

  • Experience with LLM or NLP frameworks (Hugging Face, spaCy, LangChain).

  • Ability to use AI to clean, enrich, classify, and organize technical or unstructured content.

  • Strong understanding of tokenization, chunking, and model input preparation.

  • Experience working with software project data (Git repositories, APIs, documentation).

Bonus Skills

  • Knowledge of vector databases (FAISS, Qdrant, Weaviate) or knowledge graphs (Neo4j, RDF, SPARQL).

  • Exposure to agentic AI or autonomous AI frameworks (LangChain Agents, AutoGPT, OpenAgents).

  • Experience with RAG architectures, LLMOps, or prompt pipelines.

  • Background in software engineering or technical documentation.


Recruitment Process

Our recruitment process consists of three steps:

  • Initial interview with our HR team.

  • Discussion with Guillaume, our Product Management Director, and Christophe, our R&D Director.

  • Final meeting to share our decision and next steps.

With us, the recruitment process moves quickly and efficiently!

Why Join Us?

  • Be part of a global AI innovation hub shaping the next generation of Software Intelligence.

  • Work at the intersection of data, AI, and software engineering, with real-world impact.

  • Collaborate with top AI experts and contribute to groundbreaking initiatives in AI enablement and automation.
