About
I'm a fifth-year PhD student at the University of Pennsylvania advised by Professors Mayur Naik and Eric Wong. For the last two summers, I interned at AWS AI (Fundamental Research Team), where I was advised by Matthew Trager and Stefano Soatto and worked on uncertainty quantification and experience-guided reasoning for agents. My research is supported by the NSF Graduate Research Fellowship Program.
My research aims to make AI systems reason reliably and, more broadly, behave as intended. I do this by interfacing foundation models with programs, creating agents and workflows that we can interpret, control, and verify. My position on the role of symbolic abstractions (programs) in the foundation model era is presented in a pre-print, and my follow-up work explores a core challenge in this space: reasoning using per-instance program synthesis. Recently, I've been exploring how to interpret and control foundation models in lightweight, targeted ways, and how to enable general AI systems (agents, workflows, and pipelines) to learn from experience so that they become more reliable and less costly over time.
Research Summary
The Interface Between Foundation Models (FMs) and Programs
Concepts as symbols
Surface and use FM concepts as symbols in programs.
Pre-print '25
ICML '24
ICLR Tiny '23
NeurIPS XAIA '23
ACL '25
- SuperActivators: Only the Tail of the Distribution Contains Reliable Concept Signals
  Pre-print 2025
- Towards Compositionality in Concept Learning
  ICML 2024
- TopEx: Topic-based Explanations for Model Comparison
  ICLR Tiny 2023
- Rectifying Group Irregularities in Explanations for Distribution Shift
  NeurIPS XAIA 2023
- Towards Style Alignment in Cross-Cultural Translation
  ACL 2025
Programs for reasoning
Improve model reasoning capabilities with programs.
Pre-print '25
Pre-print '25
NeurIPS '25
AACL '24
- The Road to Generalizable Neuro-Symbolic Learning Should be Paved with Foundation Models
  Pre-print 2025
- Experience-Guided Adaptation of Inference-Time Reasoning Strategies
  Pre-print 2025
- Once Upon an Input: Reasoning via Per-Instance Program Synthesis (PIPS)
  NeurIPS 2025
- Faithful Chain-of-Thought Reasoning
  AACL 2024
Program verification
Verify, debug, and control AI systems.
OOPSLA '24
NeurIPS MechInterp '25
Pre-print '25
- TorchQL: A Programming Framework for Integrity Constraints in ML
  OOPSLA 2024
- Where's the Bug? Attention Probing for Scalable Fault Localization
  NeurIPS MechInterp 2025
- Instruction Following by Boosting Attention of LLMs
  Pre-print 2025
Recent News
- 12/2025 Attending NeurIPS in San Diego to present PIPS at the main conference and three posters (two spotlights) at the Mechanistic Interpretability workshop!
- 10/2025 Founded the Penn Agentic Lab: Leading a team of undergraduates to push the frontier of software reliability via Agentic Testing.
- 9/2025 Once Upon an Input: Reasoning via Per-Instance Program Synthesis (PIPS) accepted to NeurIPS 2025.
- 7/2025 Attended ACL in Vienna to present Towards Style Alignment in Cross-Cultural Translation.
Pre-Prints
- Experience-Guided Adaptation of Inference-Time Reasoning Strategies
  Pre-print, 2025
  Adam Stein, Matthew Trager, Benjamin Bowman, Michael Kleinman, Aditya Chattopadhyay, Wei Xia, Stefano Soatto
- The Road to Generalizable Neuro-Symbolic Learning Should be Paved with Foundation Models
  [code]
  Pre-print, 2025
  Adam Stein, Aaditya Naik, Neelay Velingker, Mayur Naik, Eric Wong
- Instruction Following by Boosting Attention of Large Language Models
  [blog] [code]
  Pre-print, 2025
  MechInterp Workshop @ NeurIPS, 2025 (Spotlight Presentation)
  Vitoria Guardieiro*, Avishree Khare*, Adam Stein*, Eric Wong
- SuperActivators: Only the Tail of the Distribution Contains Reliable Concept Signals
  [code]
  Pre-print, 2025
  MechInterp Workshop @ NeurIPS, 2025
  Cassandra Goldberg, Chaehyeon Kim, Adam Stein, Eric Wong
Conference Papers
- Once Upon an Input: Reasoning via Per-Instance Program Synthesis
  [code] [demo]
  NeurIPS 2025
  Adam Stein*, Neelay Velingker*, Mayur Naik, Eric Wong
- Towards Style Alignment in Cross-Cultural Translation
  ACL 2025
  Shreya Havaldar*, Adam Stein*, Eric Wong, Lyle Ungar
- Towards Compositionality in Concept Learning
  [blog] [code]
  ICML 2024
  Adam Stein, Aaditya Naik, Yinjun Wu, Mayur Naik, Eric Wong
- TorchQL: A Programming Framework for Integrity Constraints in Machine Learning
  [code]
  OOPSLA 2024
  Aaditya Naik, Adam Stein, Yinjun Wu, Eric Wong, Mayur Naik
- Relational Query Synthesis ⋈ Decision Tree Learning
  VLDB 2024
  Aaditya Naik, Aalok Thakkar, Adam Stein, Mayur Naik, Rajeev Alur
- Faithful Chain-of-Thought Reasoning
  [blog] [code]
  AACL 2024 (Area Chair Award)
  Qing Lyu*, Shreya Havaldar*, Adam Stein*, Li Zhang, Delip Rao, Eric Wong, Marianna Apidianaki, Chris Callison-Burch
- TopEx: Topic-based Explanations for Model Comparison
  ICLR (Tiny Papers Track) 2023
  Shreya Havaldar, Adam Stein, Eric Wong, Lyle Ungar
- Learning to Select Pivotal Samples for Meta Re-weighting
  [code]
  AAAI 2023 (Oral Presentation)
  Yinjun Wu, Adam Stein, Jacob Gardner, Mayur Naik
Workshop Papers
- Where's the Bug? Attention Probing for Scalable Fault Localization
  MechInterp Workshop @ NeurIPS, 2025 (Spotlight Presentation)
  Adam Stein*, Arthur Wayne*, Aaditya Naik, Mayur Naik, Eric Wong
- Rectifying Group Irregularities in Explanations for Distribution Shift
  [code]
  XAIA @ NeurIPS, 2023
  Adam Stein, Yinjun Wu, Eric Wong, Mayur Naik
- Some Problems with Properties: A Study on Property-Based Testing in Industry
  HATRA @ SPLASH 2022
  Harrison Goldstein, Joseph W. Cutler, Adam Stein, Benjamin C. Pierce, Andrew Head
Student Mentoring
- Arthur Wayne (Applying to PhD programs)
Teaching
- TA for CIS 547, Program Analysis (University of Pennsylvania, Fall 2023)
- TA for CIS 500, Software Foundations (University of Pennsylvania, Fall 2022)
- Tutor, Tau Beta Pi (University of California, Los Angeles, 2019)