Core scripting, systems programming, and advanced querying for data engineering.
Building robust batch and real-time pipelines at scale.
Multi-cloud architecture and modern data warehousing.
- Orchestration and Workflow: Apache Airflow, Prefect, Dagster
- Data Transformation: dbt, SQL
- Infrastructure and Deployment: Docker, Kubernetes, Terraform
- CI/CD: GitHub Actions, Azure DevOps, Jenkins
- Version Control: Advanced Git workflows
- Monitoring and Reliability: Structured logging, validation, alerting
- Documentation: Pipeline lineage, runbooks, data dictionaries
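The "structured logging" item above can be sketched with Python's standard `logging` module emitting one JSON object per record, which downstream alerting can parse. This is a minimal illustration; the field names and the `pipeline` context key are assumptions, not taken from the original project.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON line for machine parsing."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # Context attached via the `extra=` kwarg, if any (assumed key).
            "pipeline": getattr(record, "pipeline", None),
        }
        return json.dumps(payload)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("etl")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("rows loaded", extra={"pipeline": "daily_sales"})
```

Because every line is valid JSON, log aggregators can filter and alert on fields rather than regex-matching free text.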
Designed and built a modular ETL pipeline using Airflow, Docker, Python, and PostgreSQL, with an interactive Streamlit interface.
Impact:
- Automated ingestion, transformation, and validation workflows
- Integrated LLM-powered metadata enrichment and reporting
- Containerized services for reproducibility and deployment readiness
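The validation workflow mentioned above can be sketched as a pre-load row check that partitions a batch into valid and rejected records. This is a hypothetical stand-in; the required field names (`id`, `amount`) are illustrative assumptions, not the project's actual schema.

```python
def validate_rows(rows, required=("id", "amount")):
    """Split rows into (valid, rejected) based on required non-null fields.

    Each rejected entry records which fields were missing, so failures
    can be logged and alerted on rather than silently dropped.
    """
    valid, rejected = [], []
    for row in rows:
        # Assumed rule: every required field must be present and non-null.
        missing = [f for f in required if row.get(f) is None]
        if missing:
            rejected.append({"row": row, "missing": missing})
        else:
            valid.append(row)
    return valid, rejected
```

Usage: `valid, rejected = validate_rows([{"id": 1, "amount": 9.5}, {"id": 2}])` keeps the first row and rejects the second for its missing `amount`.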
Key Engineering Focus:
- DAG-based orchestration
- Idempotent transformations
- Structured logging and pipeline observability
- Production-grade project structure
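Idempotent transformations, noted above, mean that replaying the same batch leaves the target in the same state; one common pattern is a keyed upsert instead of a blind append. A minimal self-contained sketch, with an in-memory dict standing in for the PostgreSQL target table:

```python
def upsert(target: dict, batch: list) -> dict:
    """Merge a batch into the target keyed by 'id'.

    Replaying the identical batch overwrites rows with the same values,
    so the result is unchanged: idempotent by construction.
    """
    for row in batch:
        target[row["id"]] = row
    return target
```

In the real pipeline this corresponds to an `INSERT ... ON CONFLICT DO UPDATE` against the primary key, so Airflow task retries cannot create duplicates.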
Open to collaborating on scalable data systems and to discussing distributed systems, cloud architecture, and systems programming.

