Data Engineering
Pipelines that surface failures instead of silently corrupting your data.
What I do
Deliverables
Pipeline design & implementation
Airflow DAGs, AWS Glue jobs, PySpark transforms. Batch and event-driven. Idempotent, retryable, and observable — not a tangle of cron jobs.
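As a rough illustration of what "idempotent and retryable" can mean in practice, here is a minimal sketch using only the Python standard library. SQLite stands in for a real warehouse, and the `orders` table, `load_partition`, and `with_retries` are hypothetical names chosen for the example, not part of any client codebase or of Airflow or Glue. The load replaces a whole date partition inside one transaction, so re-running it after a failure does not duplicate rows.

```python
import sqlite3
import time


def with_retries(fn, attempts=3, base_delay=0.0):
    """Call fn(); on failure, retry up to `attempts` times with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the failure instead of hiding it
            time.sleep(base_delay * 2 ** (attempt - 1))


def load_partition(conn, run_date, rows):
    """Idempotent load: delete and re-insert the partition for run_date atomically."""
    with conn:  # commits on success, rolls back if anything inside raises
        conn.execute("DELETE FROM orders WHERE run_date = ?", (run_date,))
        conn.executemany(
            "INSERT INTO orders (run_date, order_id, amount) VALUES (?, ?, ?)",
            [(run_date, order_id, amount) for order_id, amount in rows],
        )


# Hypothetical in-memory table standing in for a warehouse table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (run_date TEXT, order_id INTEGER, amount REAL)")

rows = [(1, 9.99), (2, 24.50)]

# Running the same load twice leaves the table in the same state.
for _ in range(2):
    with_retries(lambda: load_partition(conn, "2024-01-01", rows))

row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(row_count)
```

In an orchestrator the retry policy would usually live in the scheduler's task configuration rather than in a hand-written wrapper; the point of the sketch is only that retries are safe because the load itself is idempotent.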
Data warehousing
Snowflake schema design, dimensional modeling, and ELT patterns. Built so your analytics team can query without paging you.
Data quality & observability
Tests at ingestion, explicit handling of schema evolution, and lineage that shows where something broke and what it affects downstream. This is what turns a pipeline from a liability into infrastructure.
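A test at ingestion can be as small as checking each record against an expected schema and quarantining the ones that fail, rather than loading them silently. The sketch below is a simplified illustration in plain Python; `EXPECTED_SCHEMA`, `validate_record`, and `split_batch` are hypothetical names, and a real engagement would more likely use a dedicated data quality tool.

```python
# Hypothetical expected schema for an incoming record: field name -> Python type.
EXPECTED_SCHEMA = {"order_id": int, "amount": float, "currency": str}


def validate_record(record, schema=EXPECTED_SCHEMA):
    """Return a list of problems found in the record; an empty list means it is valid."""
    problems = []
    for field, expected_type in schema.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            actual = type(record[field]).__name__
            problems.append(f"{field}: expected {expected_type.__name__}, got {actual}")
    # Fields that are not in the schema often signal upstream schema drift.
    for field in sorted(record.keys() - schema.keys()):
        problems.append(f"unexpected field: {field}")
    return problems


def split_batch(records):
    """Separate a batch into accepted records and rejected (record, problems) pairs."""
    accepted, rejected = [], []
    for record in records:
        problems = validate_record(record)
        if problems:
            rejected.append((record, problems))
        else:
            accepted.append(record)
    return accepted, rejected


batch = [
    {"order_id": 1, "amount": 9.99, "currency": "USD"},
    {"order_id": 2, "amount": "24.50", "currency": "USD"},  # amount arrived as a string
    {"order_id": 3, "amount": 5.0},                          # currency is missing
]

accepted, rejected = split_batch(batch)
for record, problems in rejected:
    print(record["order_id"], problems)
```

Rejected records carry the reason they failed, so they can be written to a quarantine table and alerted on instead of disappearing.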
Migration & consolidation
Lift legacy ETL onto modern tooling. Consolidate data scattered across five systems into a single warehouse your team can reason about.
Engagement
How I engage
Data engagements typically run 4–12 weeks. We start by mapping your sources, decide between batch and streaming for each one, and design the warehouse schema. Pipelines are then implemented incrementally, each one tested before the next is started. I am based in India and work US/EU business hours.
Proof
Recent work
Cloud-Native Microservices Platform
Architected a microservices ecosystem using Spring Boot and .NET Core with Protocol Buffers over gRPC, feeding data into Snowflake for analytics.
Sabre: Multi-Tenant Infrastructure Automation
Designed and automated deployment of infrastructure and services in an isolated multi-tenant architecture.
Stack
Tech I use
Fit
Who this is for
- Your data lives in five-plus places and reporting is still manual.
- You have a working pipeline that breaks weekly.
- You need to move from Hadoop or legacy ETL to a modern stack.
- You're about to hire a data team and want the foundation built first.
Let's talk about your project
Send a short brief — what you're building, where you are now, what you want help with — and I'll reply within a business day.
Get in touch