Hello! I am a final-year Ph.D. candidate at Harvard University, advised by Flavio du Pin Calmon. I develop robust and reliable solutions that enable trustworthy adoption of AI across critical domains. I am currently on the job market for positions starting in September 2026, with a preference for opportunities in Europe.
My research focuses on reliable, responsible, and trustworthy Machine Learning, and my work spans GenAI Agents, supply chain management, LLM watermarking, algorithmic fairness, multiplicity, and more. I use tools and frameworks from Optimization, Information Theory, Probability, and Statistics. I am open to collaborations and can be reached via email!
I received my undergraduate degree in Math and Computer Science from NYU Courant, and my industry experience includes Citadel LLC and Meta.
News
September 2025
- Can GenAI agents manage a supply chain? Lessons from the classical Beer Game. A joint team from the Harvard Information Theory Lab, the MIT Data Science Lab, and the Georgia Tech Scheller College of Business has built **the first live simulation of the Beer Game powered by LLMs**. Try the interactive simulation here.
July 2025
- Presented Optimized Couplings for Watermarking Large Language Models (slides) at the IEEE International Symposium on Information Theory, University of Michigan.
June 2025
- Presented Kernel Multiaccuracy (slides) at the Foundations of Responsible Computing, Stanford University.
Publications
- HeavyWater and SimplexWater: Watermarking Low-Entropy Text Distributions
Dor Tsur*, Carol Xuan Long*, Claudio M. Verdun, Hsiang Hsu, Chen-Fu Chen, Haim Permuter, Sajani Vithana, Flavio P Calmon
Under Review, 2025. TL/DR:
Our goal is to design watermarks that optimally use side information to maximize detection accuracy and minimize distortion of the generated text. We propose two watermarks, **HeavyWater** and **SimplexWater**, that achieve SOTA performance. Our theoretical analysis also reveals surprising new connections between LLM watermarking and **coding theory**.
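The detection side of such schemes is usually a simple hypothesis test. Below is a minimal Python sketch of a generic score-based detector, not the specific HeavyWater/SimplexWater test statistic; `halves` (a pseudorandom binary vocabulary partition) and `side_bits` (the per-token side information) are hypothetical stand-ins for quantities the detector recomputes from a shared key.

```python
import numpy as np
from scipy.stats import norm

def detect_watermark(tokens, halves, side_bits):
    """Score-based watermark detection: count tokens whose vocabulary-partition
    bit agrees with the shared side information. On unwatermarked text each
    agreement is a fair coin flip, so a one-sided z-test exposes the watermark."""
    matches = sum(int(halves[t] == s) for t, s in zip(tokens, side_bits))
    n = len(tokens)
    z = (matches - n / 2.0) / np.sqrt(n / 4.0)
    return float(norm.sf(z))  # small p-value => text is likely watermarked
```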
- Optimized Couplings for Watermarking Large Language Models (slides)
Carol Xuan Long*, Dor Tsur*, Claudio M. Verdun, Hsiang Hsu, Haim Permuter, Flavio P Calmon
IEEE International Symposium on Information Theory (ISIT), 2025. TL/DR:
We argue that a key component in watermark design is generating a coupling between the side information shared with the watermark detector and a random partition of the LLM vocabulary. Our analysis identifies the optimal coupling and randomization strategy under the worst-case LLM next-token distribution satisfying a min-entropy constraint. We propose the **Correlated-Channel watermarking scheme**, a closed-form scheme that achieves high detection power at zero distortion.
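To make the coupling idea concrete, here is a minimal Python sketch of a textbook maximal coupling between a uniform side-information bit and a binary vocabulary partition. This illustrates the principle only and is not the paper's exact optimized construction; in practice `halves` and `side_bit` would come from a keyed pseudorandom function.

```python
import numpy as np

def coupled_sample(probs, halves, side_bit, rng):
    """Distortion-free sampling: averaged over a uniform side bit, the output
    token is distributed exactly as `probs`, while its partition bit
    `halves[token]` agrees with `side_bit` as often as any coupling allows."""
    probs = np.asarray(probs, dtype=float)
    halves = np.asarray(halves)
    q = probs[halves == side_bit].sum()        # probability mass on matching half
    match = rng.random() < min(1.0, 2.0 * q)   # maximal-coupling acceptance step
    target = side_bit if match else 1 - side_bit
    sub = np.where(halves == target, probs, 0.0)
    return int(rng.choice(len(probs), p=sub / sub.sum()))
```

Averaged over the uniform side bit, the output distribution is exactly `probs`, so the scheme is distortion-free; detection power comes from the above-chance agreement between partition bits and side bits.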
- Kernel Multiaccuracy (slides)
Carol Xuan Long, Wael Alghamdi, Alexander Glynn, Yixuan Wu, Flavio P Calmon
Foundations of Responsible Computing (FORC), 2025. TL/DR:
We connect multi-group fairness notions with *Integral Probability Metrics* and propose **KMAcc**, a non-iterative, one-step optimization that corrects multiaccuracy errors in kernel space.
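A minimal sketch of the one-step kernel correction idea, under simplifying assumptions of my own (a Gaussian RBF kernel, a hand-picked step size `eta`, and clipping to [0, 1]); the KMAcc procedure in the paper may differ in these details.

```python
import numpy as np

def rbf_kernel(X, Z, gamma=1.0):
    """Gaussian RBF kernel matrix: k(x, z) = exp(-gamma * ||x - z||^2)."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

def kernel_multiaccuracy_step(f_audit, y_audit, X_audit, f_eval, X_eval,
                              eta=1.0, gamma=1.0):
    """One non-iterative correction: shift predictions along the kernel mean
    embedding of the audit residuals (y - f), i.e., an RKHS witness of the
    multiaccuracy violations."""
    residuals = y_audit - f_audit                      # where f is miscalibrated
    witness = rbf_kernel(X_eval, X_audit, gamma) @ residuals / len(y_audit)
    return np.clip(f_eval + eta * witness, 0.0, 1.0)  # corrected scores
```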
- Predictive Churn with the Set of Good Models
Jamelle Watson-Daniels, Flavio P Calmon, Alexander D'Amour, Carol Xuan Long, David C. Parkes, Berk Ustun
Under Review, 2024. TL/DR:
We study the effect of predictive churn, i.e., flips in predictions across ML model updates, through the lens of predictive multiplicity: the prevalence of conflicting predictions over the set of near-optimal models (the ε-Rashomon set).
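Both quantities are easy to state in code; here is a minimal sketch with hypothetical helper names (the paper works with the ε-Rashomon set of near-optimal models rather than an arbitrary list of models).

```python
import numpy as np

def predictive_churn(preds_old, preds_new):
    """Fraction of examples whose predicted label flips across a model update."""
    return float(np.mean(np.asarray(preds_old) != np.asarray(preds_new)))

def ambiguity(preds_by_model):
    """Fraction of examples that receive conflicting predictions from at least
    two candidate models: a basic predictive-multiplicity measure."""
    P = np.asarray(preds_by_model)        # shape: (num_models, num_examples)
    return float(np.mean(P.min(axis=0) != P.max(axis=0)))
```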
- Multi-Group Proportional Representation in Retrieval
Alex Osterling, Claudio M Verdun, Carol Xuan Long, Alexander Glynn, Lucas Monteiro Paes, Sajani Vithana, Martina Cardone, Flavio P Calmon
Advances in Neural Information Processing Systems (NeurIPS), 2024. TL/DR:
We introduce Multi-Group Proportional Representation (MPR), a novel metric that measures representation across intersectional groups. We propose practical methods and algorithms for estimating and ensuring MPR in image retrieval, with minimal compromise in retrieval accuracy.
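As a rough illustration of the metric (not the estimation algorithms in the paper), an MPR-style audit compares group representation among retrieved items against a reference population, taking the worst case over a class of group functions; here `group_fns` is just a list of indicator functions, whereas the paper handles much richer function classes.

```python
import numpy as np

def mpr_gap(retrieved, population, group_fns):
    """Worst-case representation gap between a retrieved set and a reference
    population over a collection of group-membership functions."""
    gaps = []
    for g in group_fns:
        r = np.mean([g(x) for x in retrieved])    # representation in retrieval
        p = np.mean([g(x) for x in population])   # representation in population
        gaps.append(abs(r - p))
    return max(gaps)
```

For binary attributes, `group_fns` could include indicators for each attribute and each pairwise intersection, which already captures intersectional representation beyond marginal group counts.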
- Individual Arbitrariness and Group Fairness
Carol Xuan Long, Hsiang Hsu, Wael Alghamdi, Flavio P Calmon
Advances in Neural Information Processing Systems (NeurIPS), 2023, Spotlight Paper. TL/DR:
Fairness interventions in machine learning optimized solely for group fairness and accuracy can exacerbate predictive multiplicity. A third axis, "arbitrariness," should be considered when deploying models to aid decision-making in applications with individual-level impact.
- On the epistemic limits of personalized prediction
Lucas Monteiro Paes*, Carol Long*, Berk Ustun, Flavio Calmon (* Equal Contribution)
Advances in Neural Information Processing Systems (NeurIPS), 2022. TL/DR:
It is impossible to reliably verify, even with a dataset of $n = 8 \times 10^9$ samples (one for each person in the world), that a personalized classifier with $k \geq 19$ binary group attributes benefits every group that provides personal data. (With $k$ binary attributes there are $2^k$ intersectional groups; $k = 19$ already gives $2^{19} > 5 \times 10^5$ groups.)
Misc
Outside of work, I am a globetrotter, dancer, and music-lover. Having grown up as a swimmer, I enjoy sports of all kinds. From completing a half-marathon to recovering from an ACL injury, I've collected many stories to tell (for better or worse!). Whenever I can, I head outdoors: my top three U.S. national parks are Yellowstone, the Grand Canyon, and Mount Rainier.