Bhavana Mehta - Distributed Systems, Adaptive Databases, ML for Infrastructure

About

I am a Ph.D. candidate in Computer & Information Science at the University of Pennsylvania, advised by Prof. Boon Thau Loo and co-advised by Prof. Mohammad Javad Amiri. I also collaborate closely with Prof. Ryan Marcus on adaptive data management and distributed learning systems.

My research focuses on high-performance distributed databases, intelligent blockchains, and machine learning for systems optimization. I develop algorithms and systems that make data-intensive applications more efficient, adaptive, and reliable, combining rigorous quantitative analysis with practical engineering.

📢 Currently on the job market! I'm seeking full-time industry positions starting in 2025. Please reach out at bhavanam@upenn.edu.

Research Interests

My research interests span distributed systems, machine learning for systems, and high-performance computing:

Adaptive Database Systems: Designing self-tuning database architectures that dynamically adjust partitioning strategies, indexing, and replication in response to changing workloads and system conditions
Quantitative System Optimization: Applying mathematical modeling, reinforcement learning, and statistical methods to optimize distributed system performance and resource allocation
Scalable Distributed Systems: Creating highly available, fault-tolerant protocols with strong consistency guarantees for mission-critical applications
ML for Systems Infrastructure: Developing ML-powered techniques that automatically tune and optimize system configurations to maximize throughput and minimize latency

Key Projects

Marlin: Byzantine-Resilient Sharding System

Built Byzantine-resilient sharding system with hypergraph partitioning and RL-based resharding agent in Python/Rust; handles 16K TPS under adversarial conditions with ACID guarantees.

Technologies: Python, Rust, Reinforcement Learning, Byzantine Fault Tolerance

AdaChain: Adaptive Blockchain Orchestrator

Created adaptive blockchain orchestrator using multi-armed bandit algorithms for dynamic pipeline selection; achieved 22% throughput increase and 40% latency reduction.

Technologies: Multi-armed Bandits, Blockchain, Performance Optimization

Capybara: DPDK-based TCP Stack

Collaborated with Microsoft Research on DPDK-based TCP stack for 4 µs live-connection migration; wrote C++ and Python modules for seamless data transfer.

Technologies: DPDK, C++, Python, TCP Migration, Microsoft Research

Distributed Training Infrastructure

Built distributed training infrastructure from scratch in Rust and Python, working with a team of 5 researchers to handle GPU cluster orchestration across heterogeneous hardware; achieved a 30% throughput improvement.

Technologies: Rust, Python, GPU Clusters, Infrastructure

Selected Publications

Paper on Adaptive Distributed Systems (under submission)
Towards Full Stack Adaptivity in Permissioned Blockchains. VLDB '24
Chenyuan Wu, Mohammad J Amiri, Haoyun Qin, Bhavana Mehta, Ryan Marcus, Boon Thau Loo
Towards Adaptive Fault-Tolerant Sharded Databases. VLDB '23
Bhavana Mehta, Neelesh C A, Prashanth Iyer, Mohammad J Amiri, Boon Thau Loo, Ryan Marcus
AdaChain: A Learned Adaptive Blockchain. VLDB '23
Chenyuan Wu, Bhavana Mehta, Mohammad J Amiri, Ryan Marcus, Boon Thau Loo

Full publication list: Google Scholar

Experience

Graduate Researcher

Sep 2019 - Present

University of Pennsylvania, Distributed Systems Lab

Built distributed training infrastructure from scratch in Rust and Python, working with a team of 5 researchers to handle GPU cluster orchestration across heterogeneous hardware; achieved a 30% throughput improvement
Designed microservices control plane for job scheduling and data staging; scaled the system to 128 nodes with sub-millisecond heartbeat monitoring
Set up monitoring stack with Prometheus, Grafana, and OpenTelemetry to track cluster health and job performance; reduced time to detect failures by about 40%
Implemented automated fault-injection testing for training pipelines under node failures and network partitions; improved system availability to 99.99%
Created infrastructure-as-code using Terraform and Kubernetes to provision HPC clusters and cloud resources; reduced cluster setup time from days to 2 hours

Design Engineer

Jan 2018 - Jun 2019

Bluespec Inc., Boston, MA

Built configuration management system in Go for 500+ FPGA build variants with containerized jobs and task queues; improved build throughput by 25%
Automated CI/CD pipelines using Jenkins, Ansible, and Docker; reduced manual work in nightly builds by 80%
Worked with hardware team on kernel-bypass networking using DPDK for shuffle operations; cut interconnect overhead by 50 µs

Research Intern

May 2017 - Jul 2017

RISE Lab, Indian Institute of Technology Madras

Co-designed FPGA pipeline for 64-bit arithmetic achieving 281 ns latency; built simulation framework for validation

Technical Skills

Languages

Rust, Go, C++, Python, Bash, SQL, Terraform

Distributed Systems

Microservices, gRPC, Kafka, Sharding, Consensus Protocols, Load Balancing

Infrastructure & Cloud

Kubernetes, Docker, Terraform, AWS, GCP, Linux, DPDK

Observability & DevOps

Prometheus, Grafana, OpenTelemetry, Jenkins, CI/CD, Chaos Engineering

Databases

PostgreSQL, Cassandra, Redis, ScyllaDB, Query Optimization