About Raphael
English
Fluent
French
Native or bilingual
Experience
- Bloom social analytics
Lead Data Engineer
SOCIAL NETWORKS
May 2021 - March 2023 (1 year and 10 months)
Managed the development of the new data processing platform within a team of 4 developers.
The new platform scales to run data analysis, enrichment and graph computation for multiple projects in parallel, each containing from 1M to 40M documents to process.
With the new architecture, the average processing time decreased from 14 hours to 3 hours. This strongly reduced the number of failures during the processing workflow and let end users analyze more data, since fully processed data is now delivered in a reasonable time.
The new architecture mixes streaming and batch processing to provide very fast orchestration of each analysis step.
## Streaming architecture
- Microservices following the streaming enrichment pattern using Kafka as data source and output
- Any subset of the data flowing through a microservice can be easily invalidated without manual operations or topic cleaning
- Zero data loss with an at-least-once consumption strategy
- Autoscaling managed by Kubernetes
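The at-least-once point above can be illustrated with a minimal Python sketch. It is not the platform's code: `InMemoryPartition`, `EnrichmentService` and `run_with_restarts` are hypothetical names, and a plain list stands in for a Kafka partition. The idea shown is only that the offset is committed after processing succeeds, so a crash causes a replay rather than a loss, and that keying the output by document id makes the replay harmless.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class InMemoryPartition:
    """Hypothetical stand-in for one Kafka partition and its committed offset."""
    records: list
    committed: int = 0  # offset of the next record to read after a restart


@dataclass
class EnrichmentService:
    """Toy enrichment step that crashes once, after writing, on a chosen id."""
    fail_once_on: Optional[str] = None
    output: dict = field(default_factory=dict)  # keyed by id: replays overwrite
    attempts: int = 0

    def process(self, record: dict) -> None:
        self.attempts += 1
        self.output[record["id"]] = {**record, "enriched": True}
        if record["id"] == self.fail_once_on:
            self.fail_once_on = None
            raise RuntimeError("simulated crash before the offset commit")


def consume(partition: InMemoryPartition, service: EnrichmentService) -> None:
    """Read from the last committed offset; commit only after success."""
    offset = partition.committed
    while offset < len(partition.records):
        service.process(partition.records[offset])  # may raise: nothing committed
        offset += 1
        partition.committed = offset


def run_with_restarts(partition, service, max_restarts: int = 3) -> None:
    """Restart the consumer after a crash; it resumes from the committed offset."""
    for _ in range(max_restarts + 1):
        try:
            consume(partition, service)
            return
        except RuntimeError:
            continue
    raise RuntimeError("too many restarts")


partition = InMemoryPartition([{"id": "a"}, {"id": "b"}, {"id": "c"}])
service = EnrichmentService(fail_once_on="b")
run_with_restarts(partition, service)
```

In this run, record `b` is processed twice (once before the simulated crash, once after the restart), yet the output holds exactly one entry per document and no record is skipped.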
## Batch analysis architecture
- Highly scalable data platform using AWS EMR autoscaling
- Predictable workflows with failure recovery using Airflow pipelines
- Handles multiple data sources such as Amazon S3, RDS/PostgreSQL, Elasticsearch and Kafka
- Idempotent jobs
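As an illustration of what "idempotent jobs" can mean in a batch pipeline, here is a minimal Python sketch under stated assumptions: the function name `run_daily_job`, the `date=...` partition layout and the local filesystem are hypothetical stand-ins, not the actual EMR or Airflow setup. Each run recomputes one date partition in full and atomically overwrites it, so a retry after a failure yields the same result as a single successful run.

```python
import json
import os
import tempfile
from pathlib import Path


def run_daily_job(records, out_root, run_date):
    """Recompute one date partition and overwrite it atomically."""
    partition = Path(out_root) / f"date={run_date}"
    partition.mkdir(parents=True, exist_ok=True)

    # Deterministic output: same input and run_date give the same bytes.
    selected = sorted(
        (r for r in records if r["date"] == run_date),
        key=lambda r: r["id"],
    )

    # Write to a temporary file first, then swap it in, so a crash
    # mid-write never leaves a half-written partition file behind.
    target = partition / "part-00000.json"
    fd, tmp_name = tempfile.mkstemp(dir=partition, suffix=".tmp")
    with os.fdopen(fd, "w") as tmp:
        json.dump(selected, tmp)
    os.replace(tmp_name, target)
    return target


root = tempfile.mkdtemp()
records = [
    {"id": 2, "date": "2023-01-01"},
    {"id": 1, "date": "2023-01-01"},
    {"id": 3, "date": "2023-01-02"},
]
first = run_daily_job(records, root, "2023-01-01").read_text()
second = run_daily_job(records, root, "2023-01-01").read_text()  # rerun, e.g. a retry
```

Overwriting a whole partition, rather than appending to it, is what lets a scheduler retry the task safely.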
- Pernod Ricard
Cloud Data Architect
WINE AND SPIRITS
January 2021 - May 2021 (4 months)
Provided infrastructure support, guidelines and best practices to the data science teams building their data platforms.
As part of its ambition to become an innovative, data-driven company, Pernod Ricard created the Data Center of Excellence in 2020 to strengthen its ability to develop well-suited cloud data platforms.
As a Data Architect, I was in charge of providing the right architecture and tooling so that data science teams could reduce their delivery time by speeding up their development and model training phases, and of designing a robust architecture to move from an advanced MVP to a scalable, production-grade product.
- Bedrock
Data Team Leader
FILM AND AV
September 2019 - November 2020 (1 year and 2 months)
Lyon, France
Lead of the core-data team:
- technical architecture design
- organization and tracking of development work:
- raw data ingestion: Spark / Delta Lake (Scala) / AWS EMR
- partner data integration: Spark / Scala / AWS EMR / AWS EKS
- transformation of raw data into Core (Gold) data and exposure of that data to clients: Apache Airflow (Python), Amazon Athena, Apache Superset
- support for an external contractor implementing alerting algorithms to strengthen data quality
- evaluation of a dedicated data warehouse solution (Snowflake) for the self-service analytics offering
Data Engineer, A/B test team (Spark/Scala/EMR):
- redesign of the A/B test application following the Functional Data Engineering model
- implementation of a generic KPI computation system
- migration from Hadoop to AWS
Education
- Master's degree (maîtrise) in political science, Université Lyon 2, Lyon, 2006
- DUT in computer science, IUT A Lyon 1, 2009
Certifications
- Big Data Analysis with Scala and Spark, Coursera, 2017
- Functional Programming Principles in Scala, Coursera, 2017