About me
I'm a Data Engineer with 7+ years of experience designing large-scale data pipelines that support business decisions. At companies like Biossance and Amazon, I reduced report consolidation time from hours to minutes.
I specialize in building efficient batch and real-time streaming pipelines and cloud warehousing architectures using Apache Iceberg, dbt, Databricks, Snowflake, Kafka, and Spark (PySpark).
Skills
Programming Languages
SQL, Python, Shell Scripting
Databases & Cloud Technologies
dbt Core, Trino, Apache Iceberg, Snowflake, Databricks, Apache Kafka, Apache Spark, GCP (Cloud Run, Cloud Storage, BigQuery, IAM), Parquet, MySQL, PostgreSQL, Redshift
Orchestration & Analytics Tools
Git, Apache Airflow, Docker, Tableau, Streamlit, DOMO, Google Analytics, Salesforce Analytics
AI Tools
RAG, Milvus vector database, Neo4j, LangChain, MCP
Portfolio
Resume
Experience
Data Engineer - Bill.com
Apr 2025 — Sep 2025
- Migrated dbt models from Snowflake to a medallion architecture on Apache Iceberg, lowering ingestion costs and shortening time to insight.
- Optimized data models by eliminating inefficient joins and table scans through query pruning, predicate pushdown, and clustering.
- Increased productivity and data credibility by creating reusable unit and integration test scripts, cutting manual work by 25%.
Analytics Engineer - BigPanda
Nov 2024 — Mar 2025
- Identified and resolved slow joins and full table scans through query profiling in Snowflake, reducing dashboard refresh time by 15%.
- Redesigned dbt models using materialized views, reducing full table scans and UNION ALL operations and speeding up query runtime by 30%.
- Built a unified account health data model in dbt and Snowflake by integrating multi-source metrics, enabling consistent insights that improved subscription growth and reduced churn.
Data Engineer - Biossance.com
Oct 2020 — Aug 2023
- Developed a data model using MySQL and ETL to unify sales from disparate data sources, reducing reporting time by 4 hours.
- Implemented a data model and pipeline to unify retail sales across 10+ countries, reducing dashboard consolidation time by 4 hours.
- Received the Catalyst Award for proactively identifying and resolving a critical business process issue that impacted weekly leadership reports.
Business Intelligence Engineer - Amazon.com
Oct 2018 — May 2020
- Automated a weekly dashboard drawing on disparate data sources using Redshift and Tableau, reducing dashboard update time by 85%.
- Built a data pipeline in Redshift to compute feature scores for 3P sellers, which improved customer satisfaction and retention rates.
Business Intelligence Engineer - Micron Technology
Jun 2017 — Oct 2018
- Built a data pipeline and monitoring dashboard to track storage and shelf life, which increased the efficiency of manufacturing and operations.
Education
Data Engineering Bootcamp - DataExpert.io
May 2024 — Jul 2024
Completed an intensive bootcamp covering dimensional modeling, Trino, Kafka, Spark, dbt, Airflow, and Apache Iceberg. Won the best capstone award for an NYC Citibike streaming pipeline project delivering real-time bike status updates.
Data Engineering Bootcamp - Washington University in St. Louis
Aug 2023 — Jan 2024
Completed 3 capstone projects building data pipelines using Python, Airflow, Apache Spark, Apache Kafka, and cloud platforms. Built an Airflow pipeline using PySpark, Cloud Storage, and BigQuery to analyze billions of multi-exchange trading records.
University of Washington
2017
Master of Science in Information Management
University of Mumbai
2012
Bachelor of Science in Computer Engineering
Certifications
Google Cloud Certified Professional Data Engineer
Dec 2024 — Dec 2026
Databricks Certified Data Engineer Associate
Oct 2024 — Oct 2026
Contact
Thank you for visiting my portfolio! I'd love to connect with you.