Hello! I am a Ph.D. student at Harvard University, advised by Flavio du Pin Calmon. My expected graduation date is May 2026! I completed my undergraduate degree in Math and Computer Science at NYU Courant and I interned at Citadel LLC and Meta.
My research interests lie in Responsible and Trustworthy Machine Learning, and my work spans LLM watermarking, algorithmic fairness, multiplicity, and more. I study the impacts of ML algorithms across different domains of society and on different (exponentially many) groups of people, using tools and frameworks from Information Theory, Probability, and Statistics. I am always open to collaborations and can be reached via email!
Publications
- HeavyWater and SimplexWater: Watermarking Low-Entropy Text Distributions
Dor Tsur*, Carol Xuan Long*, Claudio M. Verdun, Hsiang Hsu, Chen-Fu Chen, Haim Permuter, Sajani Vithana, Flavio P Calmon
Under Review, 2025. TL/DR:
Our goal is to design watermarks that optimally use side information to maximize detection accuracy and minimize distortion of the generated text. We propose two watermarks, **HeavyWater** and **SimplexWater**, that achieve state-of-the-art (SOTA) performance. Our theoretical analysis also reveals surprising new connections between LLM watermarking and **coding theory**.
- Optimized Couplings for Watermarking Large Language Models, (slides)
Carol Xuan Long*, Dor Tsur*, Claudio M. Verdun, Hsiang Hsu, Haim Permuter, Flavio P Calmon
IEEE International Symposium on Information Theory (ISIT), 2025. TL/DR:
We argue that a key component in watermark design is generating a coupling between the side information shared with the watermark detector and a random partition of the LLM vocabulary. Our analysis identifies the optimal coupling and randomization strategy under the worst-case LLM next-token distribution that satisfies a min-entropy constraint. We propose the **Correlated-Channel watermarking scheme**, a closed-form scheme that achieves high detection power at zero distortion.
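A minimal sketch of the general partition idea, assuming shared side information (e.g., a key or a hash of preceding tokens) and a simple z-score detector; this is the generic scheme for illustration only, not the paper's optimized coupling:

```python
import numpy as np

def favored_half(side_info: int, vocab_size: int) -> set:
    """Pseudo-randomly split the vocabulary using the shared side information
    (assumed to be a non-negative integer seed) and return one half."""
    rng = np.random.default_rng(side_info)
    return set(rng.permutation(vocab_size)[: vocab_size // 2])

def detection_score(tokens, side_infos, vocab_size: int) -> float:
    """z-score of how many tokens land in the favored half; without a watermark,
    each token lands there with probability roughly 1/2."""
    hits = sum(t in favored_half(s, vocab_size) for t, s in zip(tokens, side_infos))
    n = len(tokens)
    return (hits - n / 2) / np.sqrt(n / 4)
```

During generation, the sampler would tilt probability mass toward the favored half, so watermarked text accumulates a large score while unwatermarked text concentrates around zero.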
- Kernel Multiaccuracy, (slides)
Carol Xuan Long, Wael Alghamdi, Alexander Glynn, Yixuan Wu, Flavio P Calmon
Foundations of Responsible Computing (FORC), 2025. TL/DR:
We connect multi-group fairness notions with *Integral Probability Metrics* and propose **KMAcc**, a non-iterative, one-step optimization that corrects multiaccuracy errors in kernel space.
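Under the IPM view, the worst-case multiaccuracy violation over the unit ball of a reproducing-kernel Hilbert space has a closed form in the Gram matrix. A minimal sketch of that audit, assuming an RBF kernel and labels/scores as arrays (the KMAcc correction step itself is not reproduced here):

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

def rkhs_multiaccuracy_violation(X, y, scores, gamma=1.0) -> float:
    """Worst-case multiaccuracy violation sup_{||c||_H <= 1} (1/n) sum_i c(x_i)(y_i - f(x_i)),
    which equals the RKHS norm of the residual-weighted mean embedding: sqrt(r^T K r) / n."""
    r = np.asarray(y, dtype=float) - np.asarray(scores, dtype=float)  # residuals
    K = rbf_kernel(X, X, gamma=gamma)                                 # Gram matrix k(x_i, x_j)
    return float(np.sqrt(r @ K @ r) / len(r))
```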
- Predictive Churn with the Set of Good Models
Jamelle Watson-Daniels, Flavio P Calmon, Alexander D’Amour, Carol Xuan Long, David C. Parkes, Berk Ustun
Under Review, 2024. TL/DR:
We study the effect of predictive churn (flips in predictions across ML model updates) through the lens of predictive multiplicity, i.e., the prevalence of conflicting predictions over the set of near-optimal models (the ε-Rashomon set).
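A minimal sketch of both quantities on held-out predictions, assuming 0/1 prediction arrays and a sample of near-optimal models standing in for the ε-Rashomon set:

```python
import numpy as np

def churn(preds_old: np.ndarray, preds_new: np.ndarray) -> float:
    """Predictive churn: fraction of examples whose prediction flips after a model update."""
    return float(np.mean(preds_old != preds_new))

def ambiguity(preds_by_model: np.ndarray) -> float:
    """Predictive-multiplicity ambiguity: fraction of examples on which at least two
    of the near-optimal models (one per row) disagree."""
    return float(np.mean(preds_by_model.min(axis=0) != preds_by_model.max(axis=0)))
```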
- Multi-Group Proportional Representation in Retrieval
Alex Osterling, Claudio M Verdun, Carol Xuan Long, Alexander Glynn, Lucas Monteiro Paes, Sajani Vithana, Martina Cardone, Flavio P Calmon
Advances in Neural Information Processing Systems (NeurIPS), 2024. TL/DR:
We introduce Multi-Group Proportional Representation (MPR), a novel metric that measures representation across intersectional groups. We propose practical methods and algorithms for estimating and ensuring MPR in image retrieval, with minimal compromise in retrieval accuracy.
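As a toy proxy (not the paper's MPR estimator, which measures representation against a richer class of functions of the group attributes), one can compare the retrieved set's proportions with a reference population over every intersection of binary attributes:

```python
from itertools import product
import numpy as np

def intersectional_representation_gap(retrieved: np.ndarray, reference: np.ndarray) -> float:
    """Worst-case gap between retrieved-set and reference-population proportions over
    all 2^k intersections of k binary attributes (rows = items, columns = attributes).
    Enumerating 2^k groups is tractable only for small k; illustrative only."""
    k = retrieved.shape[1]
    gaps = []
    for pattern in product([0, 1], repeat=k):
        in_ret = np.all(retrieved == np.array(pattern), axis=1).mean()
        in_ref = np.all(reference == np.array(pattern), axis=1).mean()
        gaps.append(abs(in_ret - in_ref))
    return float(max(gaps))
```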
- Individual Arbitrariness and Group Fairness
Carol Xuan Long, Hsiang Hsu, Wael Alghamdi, Flavio P Calmon
Advances in Neural Information Processing Systems (NeurIPS), 2023, Spotlight Paper. TL/DR:
Fairness interventions in machine learning optimized solely for group fairness and accuracy can exacerbate predictive multiplicity. A third axis, "arbitrariness," should be considered when deploying models to aid decision-making in applications with individual-level impact.
- On the epistemic limits of personalized prediction
Lucas Monteiro Paes*, Carol Long*, Berk Ustun, Flavio Calmon (* Equal Contribution)
Advances in Neural Information Processing Systems (NeurIPS), 2022. TL/DR:
It is impossible to reliably verify that a personalized classifier with $k \geq 19$ binary group attributes will benefit every group that provides personal data, even using a dataset of $n = 8 \times 10^9$ samples, one for each person in the world.
Misc
Outside of work, I am a globetrotter, dancer, and music lover. Having grown up as a swimmer, I enjoy sports. Between completing a half-marathon and recovering from an ACL injury, I have, for better or worse, many stories to tell. Of course, I also love cooking Cantonese/Singaporean food and reading away in the comfort of home!