Hello! I am a Ph.D. student at Harvard University, advised by Flavio du Pin Calmon. My expected graduation date is May 2026! I completed my undergraduate degree in Math and Computer Science at NYU Courant and have interned at Citadel LLC and Meta.

My research interests lie in Responsible and Trustworthy Machine Learning, and my work spans LLM watermarking, algorithmic fairness, predictive multiplicity, and more. I study the impact of ML algorithms on various domains of society and on different (exponentially many) groups of people, using tools and frameworks from Information Theory, Probability, and Statistics. I am always open to collaborations and can be reached via email!

Publications

  • HeavyWater and SimplexWater: Watermarking Low-Entropy Text Distributions
    Dor Tsur*, Carol Xuan Long*, Claudio M. Verdun, Hsiang Hsu, Chen-Fu Chen, Haim Permuter, Sajani Vithana, Flavio P. Calmon
    Under Review, 2025.
    TL;DR

    Our goal is to design watermarks that optimally use side information to maximize detection accuracy and minimize distortion of the generated text. We propose two watermarks, **HeavyWater** and **SimplexWater**, that achieve SOTA performance. Our theoretical analysis also reveals surprising new connections between LLM watermarking and **coding theory**.
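
    As a rough illustration of the detection side (a generic score-based template, not our actual scheme; the hash-based score and threshold below are assumptions): the detector shares a secret key with the generator, recomputes a pseudorandom per-token score, and runs a one-sided z-test.

    ```python
    import hashlib
    import math
    from statistics import NormalDist

    def token_score(key: bytes, context: tuple, token: int) -> float:
        """Pseudorandom score in [0, 1) derived from the shared key and local context."""
        digest = hashlib.sha256(key + repr((context, token)).encode()).digest()
        return int.from_bytes(digest[:8], "big") / 2**64

    def is_watermarked(key: bytes, tokens: list[int], window: int = 4,
                       alpha: float = 1e-3) -> bool:
        """One-sided z-test: without a watermark, scores behave like i.i.d. Uniform(0, 1)."""
        scores = [token_score(key, tuple(tokens[max(0, t - window):t]), tokens[t])
                  for t in range(len(tokens))]
        n = len(scores)
        z = (sum(scores) / n - 0.5) / math.sqrt(1 / (12 * n))  # Var of Uniform(0,1) is 1/12
        return z > NormalDist().inv_cdf(1 - alpha)
    ```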

  • Optimized Couplings for Watermarking Large Language Models (slides)
    Carol Xuan Long*, Dor Tsur*, Claudio M. Verdun, Hsiang Hsu, Haim Permuter, Flavio P. Calmon
    IEEE International Symposium on Information Theory (ISIT), 2025.
    TL;DR

    We argue that a key component in watermark design is generating a coupling between the side information shared with the watermark detector and a random partition of the LLM vocabulary. Our analysis identifies the optimal coupling and randomization strategy under the worst-case LLM next-token distribution that satisfies a min-entropy constraint. We propose the **Correlated-Channel watermarking scheme**, a closed-form scheme that achieves high detection power at zero distortion.
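
    A toy version of the coupling idea (a simple maximal coupling over a 2-way pseudorandom split; our optimized scheme differs): the shared key yields a side bit and a vocabulary partition, and the token is sampled so that its marginal stays exactly the model's next-token distribution while its bin agrees with the side bit as often as possible. The detector, who knows the key, counts bin/side-bit agreements.

    ```python
    import random

    def watermark_step(p: dict[int, float], key_rng: random.Random,
                       samp_rng: random.Random) -> tuple[int, bool]:
        """Sample one token from next-token distribution p, coupled with a side bit."""
        bins = {v: key_rng.random() < 0.5 for v in p}  # pseudorandom vocabulary partition
        s = key_rng.random() < 0.5                     # uniform side bit, shared with detector
        q = sum(pr for v, pr in p.items() if bins[v])  # p-mass of the "True" bin
        # Maximal coupling of the side bit with the token's bin; averaging over s,
        # P(bin is True) stays exactly q, so the token's marginal distribution is p.
        p_true = min(1.0, 2 * q) if s else max(0.0, 2 * q - 1.0)
        b = samp_rng.random() < p_true
        toks, ws = zip(*[(v, pr) for v, pr in p.items() if bins[v] == b])
        return samp_rng.choices(toks, weights=ws)[0], s
    ```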

  • Kernel Multiaccuracy (slides)
    Carol Xuan Long, Wael Alghamdi, Alexander Glynn, Yixuan Wu, Flavio P. Calmon
    Foundations of Responsible Computing (FORC), 2025.
    TL;DR

    We connect multi-group fairness notions with *Integral Probability Metrics* and propose **KMAcc**, a non-iterative, one-step optimization that corrects multiaccuracy errors in kernel space.
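
    A schematic of what a single kernel-space correction step can look like (the RBF kernel and step size below are illustrative assumptions, not the exact KMAcc update): smooth the base model's residuals through the kernel and add the result back in one step.

    ```python
    import numpy as np

    def rbf_kernel(A: np.ndarray, B: np.ndarray, gamma: float = 1.0) -> np.ndarray:
        sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * sq_dists)

    def one_step_kernel_correction(X, y, f_scores, eta=0.5, gamma=1.0):
        """Single non-iterative update: f + eta * (kernel-smoothed residuals)."""
        residuals = y - f_scores            # where the base model is miscalibrated
        update = rbf_kernel(X, X, gamma) @ residuals / len(y)
        return np.clip(f_scores + eta * update, 0.0, 1.0)
    ```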

  • Predictive Churn with the Set of Good Models
    Jamelle Watson-Daniels, Flavio P. Calmon, Alexander D’Amour, Carol Xuan Long, David C. Parkes, Berk Ustun
    Under Review, 2024.
    TL;DR

    We study predictive churn, i.e., flips in predictions across ML model updates, through the lens of predictive multiplicity: the prevalence of conflicting predictions over the set of near-optimal models (the ε-Rashomon set).
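
    Both quantities are straightforward to estimate on held-out data (illustrative helper names, following the standard definitions):

    ```python
    import numpy as np

    def churn(pred_old: np.ndarray, pred_new: np.ndarray) -> float:
        """Fraction of examples whose predicted label flips after a model update."""
        return float(np.mean(pred_old != pred_new))

    def ambiguity(pred_base: np.ndarray, rashomon_preds: list[np.ndarray]) -> float:
        """Fraction of examples flipped by at least one model in the ε-Rashomon set."""
        flipped = np.zeros(pred_base.shape, dtype=bool)
        for preds in rashomon_preds:        # predictions of each near-optimal model
            flipped |= preds != pred_base
        return float(np.mean(flipped))
    ```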

  • Multi-Group Proportional Representation in Retrieval
    Alex Oesterling, Claudio M. Verdun, Carol Xuan Long, Alexander Glynn, Lucas Monteiro Paes, Sajani Vithana, Martina Cardone, Flavio P. Calmon
    Advances in Neural Information Processing Systems (NeurIPS), 2024.
    TL;DR

    We introduce Multi-Group Proportional Representation (MPR), a novel metric that measures representation across intersectional groups. We propose practical methods and algorithms for estimating and ensuring MPR in image retrieval, with minimal compromise in retrieval accuracy.
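
    For intuition, a simplified stand-in for the metric (MPR itself takes a supremum over a richer function class than the indicator groups enumerated here): compare each intersectional group's share of the retrieved set against its share of a reference population.

    ```python
    import itertools
    import numpy as np

    def representation_gap(retrieved: np.ndarray, population: np.ndarray) -> float:
        """Worst-case gap in group shares; rows are items, columns are binary attributes."""
        k = population.shape[1]
        worst = 0.0
        for combo in itertools.product([0, 1], repeat=k):  # all 2^k intersectional groups
            share_ret = np.mean((retrieved == combo).all(axis=1))
            share_pop = np.mean((population == combo).all(axis=1))
            worst = max(worst, abs(share_ret - share_pop))
        return worst
    ```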

  • Individual Arbitrariness and Group Fairness
    Carol Xuan Long, Hsiang Hsu, Wael Alghamdi, Flavio P. Calmon
    Advances in Neural Information Processing Systems (NeurIPS), 2023, Spotlight Paper.
    TL;DR

    Fairness interventions in machine learning optimized solely for group fairness and accuracy can exacerbate predictive multiplicity. A third axis of "arbitrariness" should be considered when deploying models to aid decision-making in applications with individual-level impact.

  • On the epistemic limits of personalized prediction
    Lucas Monteiro Paes*, Carol Xuan Long*, Berk Ustun, Flavio P. Calmon (* Equal Contribution)
    Advances in Neural Information Processing Systems (NeurIPS), 2022.
    TL;DR

    It is impossible to reliably verify that a personalized classifier with $k \geq 19$ binary group attributes will benefit every group that provides personal data, even using a dataset of $n = 8 \times 10^9$ samples, one for each person in the world.
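
    Back-of-envelope intuition (not the paper's formal argument): with $k = 19$ binary attributes there are $2^{19}$ intersectional groups, so even a dataset covering the entire world population averages only about $1.5 \times 10^4$ samples per group, and the smallest groups can be far smaller or empty:

    $$2^{19} = 524{,}288 \quad\Rightarrow\quad \frac{8 \times 10^{9}}{2^{19}} \approx 1.5 \times 10^{4} \ \text{samples per group on average.}$$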

Misc

Outside of work, I am a globetrotter, dancer, and music lover. Having grown up as a swimmer, I enjoy sports; between completing a half-marathon and recovering from an ACL injury, I have many stories to tell, for better or worse. Of course, I also love cooking Cantonese/Singaporean food and reading away in the comfort of home!