The stack I live in

These are the GCP services I work with regularly — not a checkbox list, but tools I've used to solve real problems.

BigQuery

Data Warehouse

Dataflow

Stream & Batch Processing

Pub/Sub

Messaging & Streaming

Cloud Composer

Workflow Orchestration

Cloud Storage

Object Storage

Dataproc

Spark & Hadoop

Cloud Functions

Serverless Compute

Cloud Run

Containerized Workloads

Looker / Looker Studio

BI & Visualization

Vertex AI

ML Platform

IAM & Security

Access & Identity

Terraform (GCP)

Infrastructure as Code


Across the stack

Data Engineering

Pipeline design & architecture, Streaming data, Batch processing, Data modeling, ELT / ETL, Data quality & validation, Schema design, Workflow orchestration

Languages & Tools

Python, SQL, Apache Beam, Apache Airflow, dbt, Terraform, Docker, Bash / Shell, Git

Practices

Infrastructure as Code, CI/CD, Observability & monitoring, Cost optimization, Data governance, Technical documentation, Code review

How I work

Reliability over speed

A pipeline that processes data correctly 100% of the time is more valuable than one that processes it twice as fast with silent failures. I build observability in from the start, not as an afterthought.
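As a rough illustration of what "no silent failures" can look like in practice, here is a minimal Python sketch. It is not taken from any real pipeline: the `BatchReport` class, the `process_batch` function, and the rule that a record needs non-null `id` and `amount` fields are all hypothetical, chosen only to show rejected rows being counted and explained instead of dropped.

```python
from dataclasses import dataclass, field


@dataclass
class BatchReport:
    """Counts recorded for one batch so nothing is dropped silently."""
    received: int = 0
    written: int = 0
    rejected: list = field(default_factory=list)  # (record, reason) pairs


def process_batch(records, required_fields=("id", "amount")):
    """Validate records and return the clean rows plus a report.

    Hypothetical rule for illustration: a record is valid when every
    required field is present and not None.
    """
    report = BatchReport()
    clean = []
    for record in records:
        report.received += 1
        missing = [f for f in required_fields if record.get(f) is None]
        if missing:
            # Keep the bad row and the reason instead of discarding it.
            report.rejected.append((record, f"missing: {', '.join(missing)}"))
            continue
        clean.append(record)
        report.written += 1

    # Reconciliation check: every input row must be accounted for.
    if report.received != report.written + len(report.rejected):
        raise RuntimeError("row counts do not reconcile")
    return clean, report


rows = [
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": None},
    {"amount": 5.0},
]
clean_rows, batch_report = process_batch(rows)
print(batch_report.received, batch_report.written, len(batch_report.rejected))
```

In a real system the report would more likely be emitted as metrics or structured logs; the point of the sketch is only that received, written, and rejected counts are tracked from the first version of the code.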

Cost-conscious by default

Cloud costs compound quickly when nobody is planning for them. I think about query efficiency, partition strategies, and data lifecycle from the design phase, not after someone gets a billing alert.
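To make the partitioning point concrete, here is a small back-of-the-envelope sketch in Python. The table shape (365 daily partitions of 2 GiB each) and the `scanned_bytes` helper are assumptions made up for the example; it models only the idea that a query filtered on the partition column reads the matching partitions, and it deliberately leaves out pricing, which varies by service and billing model.

```python
def scanned_bytes(daily_partition_bytes, days_queried):
    """Bytes read when a query filters on the partition column.

    Assumes one partition per day, ordered oldest to newest, and a
    query over the most recent `days_queried` days.
    """
    if days_queried <= 0:
        return 0
    return sum(daily_partition_bytes[-days_queried:])


# Hypothetical table: 365 daily partitions of 2 GiB each.
GIB = 1024 ** 3
partitions = [2 * GIB] * 365

full_scan = scanned_bytes(partitions, len(partitions))
last_week = scanned_bytes(partitions, 7)

print(f"full scan: {full_scan / GIB:.0f} GiB")
print(f"last 7 days: {last_week / GIB:.0f} GiB")
print(f"reduction: {1 - last_week / full_scan:.1%}")
```

Under these assumed numbers, the seven-day query reads 14 GiB instead of 730 GiB, which is the kind of difference that is cheap to design for up front and expensive to retrofit.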

Infrastructure as code

If it can't be reviewed, version-controlled, and reproduced, it's a liability. Terraform for infrastructure, dbt for transformations, Airflow for orchestration — all of it in source control.

Documentation that lasts

I write documentation for the next person who has to understand the system at 2am when something breaks. The goal is clarity, not coverage. One sentence that explains the why beats five that describe the what.


See what I've built

Personal projects and open work, including The Prompt Kitchen.

View projects | Get in touch