
National Internet Observatory, Northeastern University
Data Engineer / Backend Engineer
Oct 2024 — Present
Engineered and optimized a Prefect + Dask ETL pipeline that migrates 5–10 million MongoDB network records per minute into PostgreSQL. Containerized the stack with Docker and shipped automated GitLab CI + Helm deployments on Kubernetes; the pipeline has processed 5+ billion packets to date. Dockerized Superset, Grafana, and Prometheus for real-time observability across databases and warehouses. Built a Django-authenticated visualization app backed by FastAPI + Polars services with JWT-secured (HS256) endpoints, which helped secure $1M in funding. Designing an open-source Dask loader targeting 25 million records per minute, a 2–3× throughput gain over Dask-Mongo.



