PhD Student - Data Science

Permanent contract (CDI)
Paris
Salary: Not specified
Remote work not permitted

Yassir

The position

Job description

About Yassir

Yassir is the leading super app for on-demand services, including ride-hailing, last-mile delivery, payment services and more, and is set to change the way daily services are provided. It currently operates in 45 cities across multiple countries. It has raised $150 million in Series B funding, five times what it raised in its previous priced round last November, from world-class investors such as BOND and Y Combinator, the accelerator that backed the likes of Airbnb, Stripe, Dropbox and DoorDash, among others.

We’re not just about serving people - we’re about creating a marketplace to bring people what they need while infusing social values.

General context and research problem

Recent progress in deep learning has led to major advances in Natural Language Processing (NLP). Sentiment analysis, one of its most complex tasks, has also progressed considerably now that efficient neural language-understanding models can be trained on dialogue data collected from various sources (emails, SMS, comments, question answering, event booking, etc.).

However, these models are still very limited when it comes to providing accurate results on the Arabic language [1]. Arabic is classified as a Category IV language in terms of complexity and learning difficulty [2] and has a high capacity for forming new words with rich semantic meanings [3]. Progress is being made for Standard Arabic and for some dialects that are more or less formal (Gulf Peninsular, Levantine, and Egyptian) [4]. However, no such progress has been made for the Maghreb dialects (Morocco, Algeria, Tunisia) [4].

Scientific objective - results and obstacles to be overcome

The objective of the thesis is to propose solutions that pool natural language understanding and sentiment analysis, that is, to study the progressive fusion of various tasks combining language diarization and language transliteration for Arabic representation or manipulation. The application context will first be dialect written only in Arabic script, then full-form dialect that mixes Latin and Arabic characters and words from different languages.
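To make the mixed-script setting concrete, here is a minimal Python sketch, not part of the thesis proposal itself, that tags each whitespace-separated token of a comment as Arabic, Latin or mixed script. This is one plausible preprocessing step ahead of transliteration. The function names (`char_script`, `token_script`, `tag_scripts`) and the sample comment are illustrative assumptions; only the standard `unicodedata` module is used.

```python
import unicodedata


def char_script(ch: str) -> str:
    """Return a coarse script label for a single character."""
    if not ch.isalpha():
        # Digits, punctuation and diacritics carry no script information here.
        return "other"
    name = unicodedata.name(ch, "")
    if name.startswith("ARABIC"):
        return "arabic"
    if name.startswith("LATIN"):
        return "latin"
    return "other"


def token_script(token: str) -> str:
    """Label a token as 'arabic', 'latin', 'mixed' or 'other'."""
    scripts = {char_script(ch) for ch in token} - {"other"}
    if not scripts:
        return "other"
    if len(scripts) == 1:
        return scripts.pop()
    return "mixed"


def tag_scripts(text: str) -> list[tuple[str, str]]:
    """Tag every whitespace-separated token with its script label."""
    return [(tok, token_script(tok)) for tok in text.split()]


# Hypothetical comment mixing French, Arabic script and Latin-script dialect.
print(tag_scripts("merci شكرا 3lach"))
```

Digits are ignored when labeling a token, so Latin-script dialect spellings that use digits for Arabic sounds (such as the hypothetical `3lach` above) are still tagged as Latin.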

The focus will be on the development of original and efficient learning strategies for the construction of these multi-task neural models, rather than testing existing models blindly. Among these strategies, the use of prompting techniques as well as attention models is highly anticipated [5-7].

The work will be based on local corpora built from comments extracted from the different YASSIR apps. In this respect, one of the main obstacles is labeling this data in a meaningful way within the learning process, in terms of both sentiment and transliteration.

About your experience

  • Deep learning, machine learning
  • Natural language processing
  • Python, Shell
  • SQL
  • Optionally, knowledge extraction and management or graph databases
  • Google Cloud Platform
  • Good communication in English, both oral and written
  • Education required: Master's degree or engineering degree
  • Field: Computer science or applied math with a focus on machine learning
  • Interested candidates need to send their transcripts from the last year of study (Master 2) and at least 3 reference contacts (e-mail and phone number), preferably teachers or supervisors who are experts in machine learning, artificial intelligence or natural language processing.

    References

    [1] M. Al-Ayyoub, A. A. Khamaiseh, Y. Jararweh, and M. N. Al-Kabi, “A comprehensive survey of Arabic sentiment analysis,” Information Processing & Management, vol. 56, no. 2, Elsevier BV, pp. 320–342, Mar. 2019. doi: 10.1016/j.ipm.2018.07.006.

    [2] Foreign Service Institute (FSI), “Foreign Language Training,” 2021. URL: https://www.state.gov/foreign-language-training/.

    [3] K. Shaalan, S. Siddiqui, M. Alkhatib, and A. Abdel Monem, “Challenges in Arabic Natural Language Processing,” in Computational Linguistics, Speech and Image Processing for Arabic Language, World Scientific, pp. 59–83, Sep. 2018. doi: 10.1142/9789813229396_0003.

    [4] K. Meftouh, K. Smaili, and N. Bouchemal, “A study of non-resourced language: the case of an Algerian dialect,” The Third International Workshop on Spoken Languages Technologies for Under-resourced Languages, vol. 12, 1–7, 2012. doi: 10.13140/RG.2.1.4881.1041.

    [5] H. Liu, L. Paola Garcia Perera, X. Zhang, J. Dauwels, A. W. H. Khong, S. Khudanpur, and S. J. Styles, “End-to-End Language Diarization for Bilingual Code-Switching Speech,” in Proc. Interspeech, 2021. doi: 10.21437/Interspeech.2021-82.

    [6] S. A. Chowdhury, A. Hussein, A. Abdelali, and A. Ali, “Towards One Model to Rule All: Multilingual Strategy for Dialectal Code-Switching Arabic ASR,” in Proc. Interspeech, 2021. doi: 10.21437/Interspeech.2021-1809.

    [7] Y. Jia, H. Zen, J. Shen, Y. Zhang, and Y. Wu, “PnG BERT: Augmented BERT on Phonemes and Graphemes for Neural TTS,” in Proc. Interspeech, 2021. doi: 10.21437/Interspeech.2021-1757.

    Process

    Candidates will be evaluated in 4 steps (no exceptions are made):

    1. Phone interview/screening

    2. Technical interview with our applied researcher

    3. Programming assignment: candidates read a paper and implement it

    4. Review of the programming assignment

    As a company, we are passionate about diversity and inclusion: 40% of our team are women leaders in the tech sector. Research shows that women often do not apply for jobs unless they meet all of the requirements. We would like to hear from you if you feel you would be a good fit for us!

    Do you want to become part of our first-class team? Then you absolutely have to send us your application. 🚀

    PS: If you want to stand out in your application, tell us in your cover letter why we should have you on our team. 💡

    Diversity, Inclusion & Engagement:

    We celebrate diversity and are committed to creating an inclusive environment for all employees, as we believe diverse teams are more successful in the long term. We do not discriminate on the basis of race, color, religion, marital status, age, national origin, ancestry, physical or mental disability, medical condition, pregnancy, genetic information, gender, sexual orientation, gender identity or expression, and we encourage all people equally to apply for jobs with us.
