About Praveen
- English: Native or bilingual
- Telugu: Native or bilingual
Experience
- The Apache Software Foundation, Committer
  June 2014 - Today (12 years)
  Worked on the initial proposal and release of the Pig-on-Spark project, which aims to support Apache Spark as a compute engine for Pig. More details on the project can be found in the umbrella JIRA: https://issues.apache.org/jira/browse/PIG-4059
- Inflection AI, Senior Data Engineer
  May 2022 - October 2022 (5 months)
  Hyderabad, Telangana, India
  Developed scalable data pipelines using Apache Spark and PySpark to process and deduplicate over 12TB of web archive data stored in AWS S3. Implemented a fuzzy matching algorithm based on Jaccard similarity to identify and remove redundant records, enhancing data quality and reducing bias in large language model training. Deployed on AWS EMR clusters, achieving 80% accuracy in duplicate detection and significantly improving model generalization and computational efficiency.
- Amazon, Data Engineer (E-commerce)
  June 2017 - May 2020 (2 years and 11 months)
  Hyderabad, Telangana, India
  Worked as a Data Engineer II in the Supply Chain and Optimization Technology (SCOT) org of Amazon.
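The Inflection AI entry above mentions fuzzy matching based on Jaccard similarity. The profile does not describe the actual implementation, so the following is only a minimal plain-Python sketch of the general idea, under stated assumptions: records are compared as sets of word shingles, and the names `shingles`, `jaccard`, `deduplicate`, the shingle size `k=3`, and the `threshold=0.7` cut-off are all hypothetical choices made for illustration.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles of a text (lowercased)."""
    words = text.lower().split()
    if len(words) < k:
        # Short texts become a single shingle; empty texts yield an empty set.
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B|; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def deduplicate(records, threshold=0.7, k=3):
    """Keep the first record of each near-duplicate group.

    Assumption: a simple pairwise scan, quadratic in the number of records,
    which is only practical for small inputs.
    """
    kept = []
    kept_shingles = []
    for record in records:
        s = shingles(record, k)
        # Keep the record only if it is not too similar to anything kept so far.
        if all(jaccard(s, other) < threshold for other in kept_shingles):
            kept.append(record)
            kept_shingles.append(s)
    return kept
```

A pairwise scan like this would not scale to terabytes of data. At that size an approximate method such as MinHash with locality-sensitive hashing is a common choice, though whether it was used here is not stated in the profile.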
Education
- Bachelor of Technology (Computer Science), Acharya Nagarjuna University, 2010