Laboratory presentation
CESI LINEACT (UR 7527), the Laboratory for Digital Innovation for Businesses and Learning to Support the Competitiveness of Territories, anticipates and supports the technological transformations of the sectors and services related to industry and construction. CESI's long-standing proximity to companies is a decisive asset for our research activities, and has led us to focus our efforts on applied research conducted close to companies and in partnership with them. A human-centered approach combined with the use of technology, together with territorial networking and links with training, has shaped a cross-cutting research program that places humans, their needs, and their uses at the center of its questions and addresses the technological angle through these contributions.
Its research is organized around two interdisciplinary scientific teams and several application areas.
Team 1, "Learning and Innovating", mainly draws on cognitive science, social and management sciences, and training and innovation techniques. Its main scientific objective is to understand the effects of the environment, and more particularly of situations instrumented by technical objects (platforms, prototyping workshops, immersive systems, etc.), on learning, creativity, and innovation processes.
Team 2, "Engineering and Digital Tools", mainly draws on digital sciences and engineering. Its main scientific objectives focus on the modeling, simulation, optimization, and data analysis of cyber-physical systems. Its work also covers decision-support tools and the study of human-system interactions, in particular through digital twins coupled with virtual or augmented environments.
These two teams develop and combine their research in application areas such as Industry 5.0, Construction 4.0 and the Sustainable City, and Digital Services.
These areas are supported by research platforms, mainly the platform in Rouen dedicated to Factory 5.0 and those in Nanterre dedicated to Factory 5.0 and Construction 4.0.
Description
Industry 5.0 marks a new stage in the evolution of the industrial world, built on three key pillars: human-centricity, sustainability, and resilience. Rather than focusing solely on productivity, it emphasizes creating systems that respect human capabilities, reduce environmental impact, and remain robust in the face of disruptions. Re-centering the human in industrial systems therefore introduces several challenges (Nahavandi, 2019), particularly the need to design workspaces that are more ergonomic and compatible with human capabilities. The notion of tool affordance (Gibson, 1979), borrowed from the social sciences, provides a key framework for understanding how operators perceive the objects in their environment and how they interact with them. By analyzing these interactions, it becomes possible to design technologies that are more intuitive, adapted, and genuinely human-centered.
To achieve this, an important component is the accurate detection of objects, especially tools used during industrial tasks. Modern approaches rely on deep learning techniques (Trigka & Dritsas, 2025), which typically require large amounts of annotated data to reach high performance. However, collecting and manually labelling real-world industrial datasets is costly, time-consuming, and often impractical due to production schedules, safety and confidentiality constraints. This lack of real-world data represents a major limitation for training robust detection models in industrial environments. In this context, synthetic data generated in virtual environments has demonstrated advantages by providing large-scale, perfectly annotated training samples that can improve detection robustness when real data is scarce (Ouarab et al., 2024; Ouarab et al., 2025a; Ouarab et al., 2025b).
Beyond the choice of detection algorithm, performance strongly depends on the camera configuration deployed at the workstation. In real industrial setups, deciding whether to use one or multiple cameras, and determining their positions, orientations, and viewpoints, remains a major challenge. This is particularly critical when the scene contains objects of different sizes, including small tools, which are harder to detect. As a result, identifying the best camera configuration by trial and error would require testing a very large number of combinations, which is infeasible in a real production environment.
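To give a sense of scale for this combinatorial explosion, the number of configurations to test exhaustively can be sketched as follows, assuming a hypothetical discretization of 20 candidate camera poses around the workstation and rigs of one to four cameras (both numbers are invented for illustration):

```python
from math import comb

# Hypothetical discretization: 20 candidate camera poses, rigs of 1 to 4 cameras.
candidate_poses = 20
max_cameras = 4

# Number of distinct camera configurations an exhaustive search would cover.
total = sum(comb(candidate_poses, k) for k in range(1, max_cameras + 1))
print(total)  # → 6195 configurations, each requiring a full train/evaluate cycle
```

Even with this coarse discretization, evaluating every configuration against several detection models is clearly out of reach, which motivates the optimization approach described below.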
Under these constraints, virtual environments and synthetic data generation provide a promising direction for exploring camera setups efficiently and at low cost. By leveraging a simulated industrial workstation, it becomes possible to systematically test different camera configurations (number of cameras and camera viewpoints) under realistic variations, and to compare their impact on detection performance. To avoid an exhaustive and impractical trial-and-error process, this internship will explore a range of such configurations and evaluate them across multiple object-detection models. A multi-objective optimization approach based on NSGA-III will be used to identify the best trade-offs between:
- Minimizing the number of cameras (frugality and deployability),
- Maximizing detection performance (e.g., mAP@50) for each tool,
- Minimizing the disparity between classes (reducing the standard deviation of per-class mAP).
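Once each candidate configuration has been evaluated on these three objectives, the core of the trade-off analysis is non-dominated (Pareto) filtering, which NSGA-III builds its selection on. The sketch below illustrates it on hypothetical scores (the configuration names and all numbers are invented for illustration); in practice, a full NSGA-III implementation from a library such as pymoo would be applied to the actual search space.

```python
# Each candidate: (number of cameras, mAP@50, std of per-class mAP@50).
# Objectives: minimize cameras, maximize mAP@50, minimize std.
# All values below are hypothetical placeholders, not measured results.
candidates = {
    "cfg_A": (1, 0.62, 0.15),
    "cfg_B": (2, 0.78, 0.10),
    "cfg_C": (3, 0.80, 0.09),
    "cfg_D": (4, 0.79, 0.12),  # worse than cfg_C on every objective
}

def dominates(a, b):
    """True if a is at least as good as b on all objectives and differs from b."""
    return (
        a[0] <= b[0]      # fewer or equal cameras
        and a[1] >= b[1]  # equal or higher mAP@50
        and a[2] <= b[2]  # equal or lower per-class disparity
        and a != b
    )

# Keep only configurations not dominated by any other candidate.
pareto_front = [
    name for name, obj in candidates.items()
    if not any(dominates(other, obj) for other in candidates.values())
]
print(pareto_front)  # → ['cfg_A', 'cfg_B', 'cfg_C']
```

The surviving set is the Pareto front of trade-offs; NSGA-III additionally uses reference directions to keep that front well spread across the three objectives.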
Finally, the best configuration/model combination identified in simulation will be validated on real recordings to confirm performance under real industrial conditions and quantify the domain gap.
Work program
Step 1: Literature Review - Review object detection, synthetic data in Unity, and evaluation metrics.
Step 2: Dataset Preparation - Prepare the datasets and define the evaluation protocol.
Step 3: Benchmarking on Synthetic Data - Train and compare multiple detection models on synthetic data to select the best one.
Step 4: Real-world Validation - Test the selected model on real industrial recordings to confirm its performance under real-life conditions and analyze remaining limitations.
Step 5: Reporting and Final Presentation - Write the report and prepare the final presentation with results and conclusions.
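As one concrete element of the evaluation protocol defined in Step 2, the mAP@50 metric hinges on an Intersection-over-Union (IoU) matching criterion: a predicted box counts as a true positive when its IoU with a same-class ground-truth box reaches 0.5. A minimal sketch (the box coordinates are illustrative):

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# Two 10x10 boxes overlapping by half their width: IoU = 50 / 150 = 1/3,
# below the 0.5 threshold, so this prediction would not count as a match at mAP@50.
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # → 0.333...
```

Averaging the resulting precision over recall levels and over classes at this threshold yields mAP@50, the per-tool performance objective used in the optimization.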
Required Profile
Student in the final year of a Master’s program or engineering school, specializing in computer science, computer vision, artificial intelligence, industrial engineering, or a related field.
Knowledge of Python programming, basic image processing, and fundamentals of machine learning.
Experience with the Unity environment is a plus.
Ability to work autonomously and rigorously, while also collaborating effectively within a multidisciplinary research team.
Good written and oral communication skills, especially for scientific writing and presenting research results.
Workplace: CESI Saint-Etienne du Rouvray
Start date: February 2026
Duration: 5 to 6 months