Senior Analyst, Customer Analytics and Technology Aug 2024 - Present
• Engineering scalable end-to-end ETL pipelines for 60+ email triggers using dbt, Kubernetes, AWS, Airflow, and Snowflake to process large-scale customer behavior data, generating $367M in revenue with less than 0.3% downtime.
• Working in the Advanced Analytics and Data Science team, developing ML recommendation models for affinity-based triggers that integrate behavioral insights from historical sales and clickstream data.
• Conducting A/B testing and multivariate experiments by deploying multiple email versions to segmented customer groups, visualizing detailed performance metrics (CTR, conversion rates) using Snowflake and Python, and finalizing high-performing templates to maximize ROI.
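The per-variant metric roll-up behind this experiment analysis can be sketched in plain Python. This is a minimal, hypothetical illustration: the variant names, counts, and winner-selection rule are invented, and in practice the aggregates would come from Snowflake queries rather than hard-coded literals.

```python
# Hypothetical A/B metric roll-up; counts are illustrative, not real data.
from dataclasses import dataclass


@dataclass
class VariantStats:
    sends: int
    clicks: int
    conversions: int

    @property
    def ctr(self) -> float:
        """Click-through rate: clicks per send."""
        return self.clicks / self.sends if self.sends else 0.0

    @property
    def conversion_rate(self) -> float:
        """Conversions per send."""
        return self.conversions / self.sends if self.sends else 0.0


def pick_winner(variants: dict) -> str:
    """Return the variant name with the highest conversion rate."""
    return max(variants, key=lambda name: variants[name].conversion_rate)


variants = {
    "A": VariantStats(sends=10_000, clicks=820, conversions=95),
    "B": VariantStats(sends=10_000, clicks=790, conversions=118),
}
winner = pick_winner(variants)
```

Here variant B wins on conversion rate (1.18% vs 0.95%) even though A has the higher CTR, which is why conversion, not clicks, drives the template decision.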
Data Engineering Intern Sep 2023 - Apr 2024
• Working in the Peacock ML Engineering team to fine-tune, containerize, and deploy an LLM on GCP using LangChain, Hugging Face Transformers, Vertex AI, Terraform, and LoRA to assist a team of 35 analysts with SQL query generation.
• Delivered the first LLM of its kind at Peacock, cutting data exploration time for data scientists by 70% by pointing them to relevant data sources and answering questions about the databases.
• Creating Python-based ETL pipelines for Peacock’s Machine Learning Engineering team on GCP, productionizing machine learning models with Airflow as the workflow orchestrator, and deploying them on Kubernetes.
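One way a schema-aware SQL-generation assistant like the one above can work is by injecting table metadata into the prompt before the analyst's question. The sketch below is a hedged illustration only: the tables, columns, and prompt wording are invented, not Peacock's actual schema or prompt template.

```python
# Hypothetical schema and prompt wording for an SQL-generation assistant.
SCHEMA = {
    "events": ["user_id", "event_ts", "event_type"],
    "subscriptions": ["user_id", "plan", "start_date"],
}


def build_prompt(question: str, schema: dict) -> str:
    """Render the available tables plus the analyst's question as one prompt."""
    lines = ["You are a SQL assistant. Available tables:"]
    for table, cols in schema.items():
        lines.append(f"  {table}({', '.join(cols)})")
    lines.append(f"Question: {question}")
    lines.append("Answer with a single SQL query.")
    return "\n".join(lines)


prompt = build_prompt("How many active subscriptions per plan?", SCHEMA)
```

Grounding the model in the live schema this way is what lets it point analysts to the right data sources instead of hallucinating table names.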
Data Engineer Sep 2021 - Aug 2022
• Automated and maintained ETL pipelines, ingesting data across 14 separate sources using SQL, Python, and Informatica to streamline inventory and purchase order management for clients, reducing manual work by 90%.
• Developed cloud-based data integration solutions for 12 enterprise-level pipelines through GCP Pub/Sub, Dataflow, BigQuery, and Cloud Composer, reducing the maintenance and processing time by 68%.
• Gathered data from multiple production databases via SQL queries for research and reporting, and built GCP Looker dashboards to track key metrics.
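The extract-transform-load pattern behind these multi-source pipelines can be sketched in plain Python. The source shapes and field names below are invented for illustration; the real pipelines ran on Informatica and GCP services rather than in-memory lists.

```python
# Minimal ETL sketch: hypothetical purchase-order sources and fields.
def extract(source: dict) -> list:
    """Pull raw rows from one source (here, an in-memory stand-in)."""
    return source["rows"]


def transform(rows: list) -> list:
    """Normalize field names, cast types, and drop rows missing an order id."""
    return [
        {"order_id": r["id"], "qty": int(r["qty"])}
        for r in rows
        if r.get("id")
    ]


def load(rows: list, warehouse: list) -> None:
    """Append cleaned rows to the target store (stand-in for a warehouse)."""
    warehouse.extend(rows)


sources = [
    {"rows": [{"id": "PO-1", "qty": "3"}, {"id": None, "qty": "9"}]},
    {"rows": [{"id": "PO-2", "qty": "5"}]},
]
warehouse = []
for src in sources:
    load(transform(extract(src)), warehouse)
```

Keeping extract, transform, and load as separate steps is what makes adding a 15th source a localized change rather than a pipeline rewrite.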
Data Analytics Intern May 2019 - Jun 2019
• Developed an LSTM model to forecast app performance metrics from time-series data, giving the company actionable forecasts.
• Built an internal reporting tool for customer retention using Angular, Flask, and MySQL, saving ~40 hours per month.
• Created a Random Forest model to help the SEO team prioritize the highest-impact keywords, boosting website traffic by 50% and increasing brand visibility and lead generation.
Computer Vision Intern Dec 2018 - Jan 2019
• Developed a UNet deep learning model for chest X-ray image segmentation, improving the Dice score by 5.2 points over the existing CNN model.
• Parallelized the model using Spark’s MLlib library and deployed it on Hadoop clusters, reducing the overall data preprocessing and training time by 75%.
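For reference, the Dice score reported above measures the overlap between predicted and ground-truth segmentation masks. A minimal sketch of the metric, using toy 1-D binary masks purely for illustration:

```python
# Dice coefficient on binary masks: 2|P ∩ T| / (|P| + |T|).
# Toy 1-D masks for illustration; real masks are 2-D X-ray segmentations.
def dice(pred: list, truth: list) -> float:
    inter = sum(p & t for p, t in zip(pred, truth))  # overlapping positives
    total = sum(pred) + sum(truth)                   # positives in each mask
    return 2 * inter / total if total else 1.0       # empty masks agree fully


score = dice([1, 1, 0, 1], [1, 0, 0, 1])  # 2 overlaps, 3 + 2 positives
```

The score ranges from 0 (no overlap) to 1 (perfect overlap), so a 5.2-point gain is a change in percentage points of overlap.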
Download my CV for my detailed work experience as well as links to my publications and projects!