About
I am a Ph.D. candidate in Computer & Information Science at the University of Pennsylvania, advised by Prof. Boon Thau Loo and co-advised by Prof. Mohammad Javad Amiri. I also collaborate closely with Prof. Ryan Marcus on adaptive data management and distributed learning systems.
My research focuses on high-performance distributed databases, intelligent blockchains, and machine learning for systems optimization. I develop algorithms and systems that make data-intensive applications more efficient, adaptive, and reliable, combining rigorous quantitative analysis with practical engineering solutions.
Currently on the job market! I'm seeking full-time industry positions starting in 2025. Please reach out at bhavanam@upenn.edu.
Research Interests
My research interests span distributed systems, machine learning for systems, and high-performance computing:
- Adaptive Database Systems: Designing self-tuning database architectures that dynamically adjust partitioning strategies, indexing, and replication in response to changing workloads and system conditions
- Quantitative System Optimization: Applying mathematical modeling, reinforcement learning, and statistical methods to optimize distributed system performance and resource allocation
- Scalable Distributed Systems: Creating highly available, fault-tolerant protocols with strong consistency guarantees for mission-critical applications
- ML for Systems Infrastructure: Developing ML-powered techniques that automatically tune and optimize system configurations to maximize throughput and minimize latency
Key Projects
Built Byzantine-resilient sharding system with hypergraph partitioning and RL-based resharding agent in Python/Rust; handles 16K TPS under adversarial conditions with ACID guarantees.
Technologies: Python, Rust, Reinforcement Learning, Byzantine Fault Tolerance
Created adaptive blockchain orchestrator using multi-armed bandit algorithms for dynamic pipeline selection; achieved 22% throughput increase and 40% latency reduction.
Technologies: Multi-armed Bandits, Blockchain, Performance Optimization
Collaborated with Microsoft Research on DPDK-based TCP stack for 4 µs live-connection migration; wrote C++ and Python modules for seamless data transfer.
Technologies: DPDK, C++, Python, TCP Migration, Microsoft Research
Built distributed training infrastructure from scratch in Rust and Python, working with a team of 5 researchers to handle GPU cluster orchestration across heterogeneous hardware; achieved 30% throughput improvements.
Technologies: Rust, Python, GPU Clusters, Infrastructure
Selected Publications
- Paper on Adaptive Distributed Systems, under submission
- Towards Full Stack Adaptivity in Permissioned Blockchains, VLDB '24
- Towards Adaptive Fault-Tolerant Sharded Databases, VLDB '23
- AdaChain: A Learned Adaptive Blockchain, VLDB '23
Experience
Graduate Researcher
Sept 2019 - Present
University of Pennsylvania, Distributed Systems Lab
- Built distributed training infrastructure from scratch in Rust and Python, working with a team of 5 researchers to handle GPU cluster orchestration across heterogeneous hardware; achieved 30% throughput improvements
- Designed microservices control plane for job scheduling and data staging; scaled the system to 128 nodes with sub-millisecond heartbeat monitoring
- Set up monitoring stack with Prometheus, Grafana, and OpenTelemetry to track cluster health and job performance; reduced time to detect failures by about 40%
- Implemented automated fault injection testing for training pipelines under node failures and network partitions; improved system availability to 99.99% uptime
- Created infrastructure-as-code using Terraform and Kubernetes to provision HPC clusters and cloud resources; reduced cluster setup time from days to 2 hours
Design Engineer
Jan 2018 - Jun 2019
Bluespec Inc., Boston, MA
- Built configuration management system in Go for 500+ FPGA build variants with containerized jobs and task queues; improved build throughput by 25%
- Automated CI/CD pipelines using Jenkins, Ansible, and Docker; reduced manual work in nightly builds by 80%
- Worked with hardware team on kernel-bypass networking using DPDK for shuffle operations; cut interconnect overhead by 50 µs
Research Intern
May 2017 - Jul 2017
RISE Lab, Indian Institute of Technology Madras
- Co-designed FPGA pipeline for 64-bit arithmetic achieving 281 ns latency; built simulation framework for validation
Technical Skills
Languages
Rust, Go, C++, Python, Bash, SQL, Terraform
Distributed Systems
Microservices, gRPC, Kafka, Sharding, Consensus Protocols, Load Balancing
Infrastructure & Cloud
Kubernetes, Docker, Terraform, AWS, GCP, Linux, DPDK
Observability & DevOps
Prometheus, Grafana, OpenTelemetry, Jenkins, CI/CD, Chaos Engineering
Databases
PostgreSQL, Cassandra, Redis, ScyllaDB, Query Optimization