Aaditya Naik
Final-Year PhD Student, AI for Software Engineering
Computer and Information Science
University of Pennsylvania
Email: asnaik@seas.upenn.edu

About Me

Welcome to my corner of the internet! I am a PhD student at the University of Pennsylvania, advised by Prof. Mayur Naik. My research sits at the intersection of machine learning and software engineering, focusing on neurosymbolic approaches to AI-assisted software development. I develop neurosymbolic systems that integrate symbolic reasoning directly into ML architectures to enable foundation models to reason in a reliable and trustworthy manner.

My research has been supported by the Google PhD Fellowship in Programming Technology and Software Engineering since 2023. I've also had the privilege of working on related problems in industry. At Microsoft Research, I worked with the RiSE team to develop a test-driven interactive code generation framework. At Oracle, I investigated and improved the capabilities of LLMs for solving constraint-optimization tasks.

📢 Update: I am on the academic job market! You can find my Research Statement and Teaching Statement here.

Research

My research develops neurosymbolic frameworks and solutions for challenges in software engineering. It is organized around the following themes:
  • Neurosymbolic Agents for Software Engineering: To apply AI to software engineering, we must treat programs as structured graphs governed by control flow, not merely as text. My work in this area addresses two questions: how to effectively represent programs within the context of coding agents, and how to generate verifiable guarantees that serve as feedback for model-generated code. Systems such as CodeTrek [ICLR 2022], for neurosymbolic representations of programs, and Code2Inv [CAV 2020], for neurosymbolic program verification, aim to answer these questions.
  • Program Synthesis for Coding Agents: Beyond code generation, coding agents require specifications for program safety and alignment. My work in this area focuses on synthesizing program safety specifications, as in Sporq [UIST 2021], and model alignment specifications, as in SQRL [ICML 2023]. These works leverage foundational program synthesis techniques that I have developed, such as Libra [VLDB 2024], EGS [PLDI 2021], and GenSynth [AAAI 2021].
  • General-Purpose Neurosymbolic Frameworks: Implementing the verification and synthesis tools described above requires infrastructure that is robust, expressive, and can scale to complex logic over large datasets. I have developed frameworks for programming with foundation models, such as Dolphin [ICML 2025] for training and fine-tuning, and TorchQL [OOPSLA 2024] for analyzing and debugging models at inference time.

Publications

Preprints

On Improving Neurosymbolic Learning by Exploiting the Representation Space
Aaditya Naik, Efthymia Tsamoura, Mayur Naik, Dan Roth
The Road to Generalizable Neuro-Symbolic Learning Should be Paved with Foundation Models
Adam Stein, Aaditya Naik, Neelay Velingker, Mayur Naik, Eric Wong

Conference and Journal Publications

Dolphin: A Programmable Framework for Scalable Neurosymbolic Learning
Aaditya Naik, Jason Liu, Claire Wang, Saikat Dutta, Mayur Naik, Eric Wong
TorchQL: A Programming Framework for Integrity Constraints in Machine Learning
Aaditya Naik, Adam Stein, Yinjun Wu, Mayur Naik, Eric Wong
Towards Compositionality in Concept Learning
Adam Stein, Aaditya Naik, Yinjun Wu, Mayur Naik, Eric Wong
LLM-Based Test-Driven Interactive Code Generation: User Study and Empirical Evaluation
Sarah Fakhoury, Aaditya Naik, Georgios Sakkas, Saikat Chakraborty, Shuvendu K. Lahiri
Relational Query Synthesis ⨝ Decision Tree Learning
Aaditya Naik, Aalok Thakkar, Adam Stein, Mayur Naik, Rajeev Alur
Do Machine Learning Models Learn Statistical Rules Inferred from Data?
Aaditya Naik, Yinjun Wu, Mayur Naik, Eric Wong
CodeTrek: Flexible Modeling of Code using an Extensible Relational Representation
Pardis Pashakhanloo, Aaditya Naik, Yuepeng Wang, Hanjun Dai, Petros Maniatis, Mayur Naik
Sporq: An Interactive Environment for Exploring Code Using Query-by-Example
Aaditya Naik, Jonathan Mendelson, Nathaniel Sands, Yuepeng Wang, Mayur Naik, Mukund Raghothaman
Example-Guided Synthesis of Relational Queries
Aalok Thakkar, Aaditya Naik, Nate Sands, Mukund Raghothaman, Mayur Naik, Rajeev Alur
GenSynth: Synthesizing Datalog Programs without Language Bias
Jonathan Mendelson*, Aaditya Naik*, Mukund Raghothaman, Mayur Naik
Code2Inv: A Deep Learning Framework for Program Verification
Xujie Si*, Aaditya Naik*, Hanjun Dai, Mayur Naik, Le Song

Workshop Papers

Where's the Bug? Attention Probing for Scalable Fault Localization
Adam Stein, Arthur Wayne, Aaditya Naik, Mayur Naik, Eric Wong
Do Machine Learning Models Learn Statistical Rules Inferred from Data?
Aaditya Naik, Yinjun Wu, Mayur Naik, Eric Wong
Learning to Walk over Relational Graphs of Source Code
Pardis Pashakhanloo, Aaditya Naik, Hanjun Dai, Petros Maniatis, Mayur Naik