Khadija Tahri

NLP & Speech AI Engineer | Transformers • ASR • LLM

€560/day
Nantes, FR
3-7 years

Average response time: 1 hour

About Khadija

Hello 👋

I help companies design, train, and deploy AI and machine learning solutions, with a specialization in NLP, speech recognition, and generative AI.

As a Data Scientist & ML Engineer, I work on end-to-end projects: data collection and preparation, model training and fine-tuning, evaluation, industrialization, and production deployment.

I have worked on a wide range of problems, notably in poorly documented and multilingual contexts, which has helped me develop strong technical autonomy and the ability to adapt quickly.

Areas of expertise

  • NLP & generative AI: text classification, RAG, machine translation, audio transcription, fine-tuning of Transformer models (BERT, NLLB, Wav2Vec2, etc.)
  • Machine Learning & Deep Learning: CNNs, Transformers, attention models, classification, regression, time series
  • Data Engineering & pipelines: data collection, annotation, preprocessing, and structuring
  • MLOps & deployment: MLflow, Docker, AWS, Azure, production deployment and model monitoring

Tech stack

Python · PyTorch · TensorFlow · Hugging Face · Scikit-Learn · MLflow · Docker · AWS · Azure · PySpark · SQL

Languages

  • French: Fluent
  • English: Fluent
  • Arabic: Native or bilingual

Remote only
Primarily works remotely

Experience

  • SiliconeSignal Technologies
    Data Scientist
    August 2022 - December 2025 (3 years and 4 months)
    Meknes, Morocco
    • Fine-tuned Wav2Vec2 models for Moroccan Darija speech-to-text transcription on low-resource audio datasets
    • Collected, annotated, and validated custom multilingual datasets for NLP and speech recognition tasks
    • Built a proprietary Darija ↔ English parallel corpus (7,300+ sentence pairs) for NLLB-200 translation fine-tuning
    • Designed and benchmarked Transformer and CNN-Attention architectures for sequential learning tasks
    • Developed an automated sleep stage classification pipeline using physiological signals (heart rate + respiration) on the MESA dataset (~2,000 subjects)
    • Built end-to-end data preprocessing and training pipelines for low-resource Arabic dialect NLP applications
    • Conducted model evaluation, experimentation, and performance optimization for production-oriented ML workflows
    Deep Learning · LLM · NLP · Speech Recognition · MLOps
  • AEVAWEB
    NLP Research Intern
    May 2022 - July 2022 (2 months)
    Oujda, Morocco
    • Developed a French Question-Answering system using CamemBERT and PyTorch
    • Implemented NLP preprocessing pipelines including POS tagging, lemmatization, and text normalization
    • Built and evaluated Transformer-based NLP models on Wikipedia-based QA tasks
  • INRAE
    Data Engineering Intern
    May 2021 - October 2021 (5 months)
    Aix-en-Provence, France
    • Automated an end-to-end data collection pipeline for flood event monitoring across France, aggregating and structuring heterogeneous data sources

Education

  • Master – Data Exploration & Business Intelligence (EID²)
    Galileo Institute, Sorbonne Paris Nord University
    2021
  • Bachelor of Science
    Moulay Ismail University
    2019

Categories