Short Bio
I am currently a researcher at Thinking Machines Lab. The information on this website is outdated. Previously, I was a postdoctoral fellow at Princeton Language and Intelligence (PLI), working with Chi Jin, Sanjeev Arora, and Danqi Chen. I completed my PhD in Tong Zhang's group.
Before starting my PhD studies, I served as a Senior Machine Learning Engineer at Alibaba, a prominent company in China (which later open-sourced the Qwen series of models), from 2017 to 2021. I experienced firsthand the impressive capabilities of machine learning and developed industrial-level applications. At the same time, I gained insight into the inherent challenges and instability of deep models in industrial settings. During the early years of my PhD, I also worked on Out-of-Distribution Generalization problems, such as enabling an autonomous driving system trained on city roads to navigate country roads, and ensuring that AI diagnostic systems trained on data from one hospital can reliably make predictions for patients from another hospital.
I was an awardee of the Apple AI/ML PhD Fellowship (2023) and the Hong Kong PhD Fellowship (2020).
News
July 2025, we released Goedel-Prover-V2, ranking 1st on the PutnamBench Leaderboard (again) and significantly beating the previous SOTA, DeepSeek-Prover-V2-671B.
Feb 2025, I served as an Area Chair of ACL ARR.
Feb 2025, we released Goedel-Prover for automated theorem proving, ranking 1st on the PutnamBench Leaderboard.
Oct 2024, our method SelfMoA ranked 1st on the AlpacaEval 2.0 Leaderboard.
Aug 2024, I joined Princeton Language and Intelligence as a Postdoctoral Fellow.
Jun 2024, our paper R-Tuning won the Outstanding Paper Award at NAACL 2024.
Selected Papers
(* denotes equal or core contribution)
Pre-prints
-
Yong Lin*, Shange Tang*, Bohan Lyu*, Ziran Yang*, Jui-Hui Chung*, Haoyu Zhao*, Lai Jiang*, Yihan Geng*, Jiawei Ge, Jingruo Sun, Jiayun Wu, Jiri Gesi, David Acuna, Kaiyu Yang, Hongzhou Lin*, Yejin Choi, Danqi Chen, Sanjeev Arora, Chi Jin*.
Goedel-Prover-V2: Scaling Formal Theorem Proving with Scaffolded Data Synthesis and Self-Correction.
Pre-print.
-
Wenzhe Li*, Yong Lin*, Mengzhou Xia, Chi Jin
Rethinking Mixture-of-Agents: Is Mixing Different Large Language Models Beneficial?
Pre-print.
-
Yifan Hao*, Yong Lin*, Difan Zou, Tong Zhang.
On the Benefits of Over-parameterization for Out-of-Distribution Generalization.
In submission to Annals of Statistics.
Publications
-
Yong Lin*, Shange Tang*, Bohan Lyu, Jiayun Wu, Hongzhou Lin, Kaiyu Yang, Jia Li, Mengzhou Xia, Danqi Chen, Sanjeev Arora, Chi Jin.
Goedel-Prover: A Frontier Model for Open-Source Automated Theorem Proving.
COLM 2025.
-
Hanning Zhang, Pengcheng Wang, Shizhe Diao, Yong Lin, Rui Pan, Hanze Dong, Dylan Zhang, Pavlo Molchanov, Tong Zhang.
Entropy-Regularized Process Reward Model.
TMLR.
-
Yong Lin*, Chen Liu*, Chenlu Ye*, Qing Lian, Yuan Yao, Tong Zhang.
Optimal Sample Selection Through Uncertainty Estimation and Its Application in Deep Learning.
JMLR (accepted).
-
Qizhou Wang*, Yong Lin*, Yongqiang Chen*, Ludwig Schmidt, Bo Han, Tong Zhang.
Do CLIPs Always Generalize Better than ImageNet Models?
NeurIPS 2024.
-
Yong Lin*, Hangyu Lin*, Wei Xiong*, Shizhe Diao*, [+8 authors], Han Zhao, Nan Jiang, Heng Ji, Yuan Yao, and Tong Zhang.
Mitigating the Alignment Tax of RLHF.
EMNLP 2024. [code]
-
Yong Lin*, Skyler Seto*, Maartje ter Hoeve, Katherine Metcalf, Barry-John Theobald, Xuan Wang, Yizhe Zhang, Chen Huang, Tong Zhang.
On the Limited Generalization Capability of the Implicit Reward Model Induced by Direct Preference Optimization.
EMNLP 2024 Findings.
-
Haoxiang Wang*, Yong Lin*, Wei Xiong*, Rui Yang, Shizhe Diao, Shuang Qiu, Han Zhao, Tong Zhang.
Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards.
ACL 2024.
-
Hanning Zhang*, Shizhe Diao*, Yong Lin*, Yi R. Fung, Qing Lian, Xingyao Wang, Yangyi Chen, Heng Ji, Tong Zhang.
R-Tuning: Teaching Large Language Models to Refuse Unknown Questions.
NAACL 2024 [Outstanding Paper Award, 6/2434 = 0.25%].
-
Yong Lin*, Lu Tan*, Yifan Hao*, Honam Wong, Hanze Dong, Weizhong Zhang, Yujiu Yang, Tong Zhang.
Spurious Feature Diversification Improves Out-of-distribution Generalization.
ICLR 2024.
-
Damien Teney, Yong Lin, Seong Joon Oh, Ehsan Abbasnejad.
ID and OOD Performance Are Sometimes Inversely Correlated on Real-World Datasets.
NeurIPS 2023 [Spotlight].
-
Rui Yang, Yong Lin, Xiaoteng Ma, Hao Hu, Chongjie Zhang, Tong Zhang.
What Is Essential for Unseen Goal Generalization of Offline Goal-conditioned RL?
ICML 2023.
-
Yong Lin*, Renjie Pi*, Weizhong Zhang, Xiaobo Xia, Jiahui Gao, Xiao Zhou, Tongliang Liu, Bo Han.
A Holistic View of Noise Transition Matrix in Deep Learning and Beyond.
ICLR 2023 [Spotlight].
-
Yong Lin, Shengyu Zhu, Lu Tan, Peng Cui.
ZIN: When and How to Learn Invariance by Environment Inference?
NeurIPS 2022 [Spotlight].
-
Yong Lin*, Hanze Dong*, Hao Wang, Tong Zhang.
Bayesian Invariant Risk Minimization.
CVPR 2022 [Oral].
-
Xiao Zhou*, Yong Lin*, Weizhong Zhang*, Tong Zhang.
Sparse Invariant Risk Minimization.
ICML 2022.
-
Xiao Zhou*, Yong Lin*, Renjie Pi*, Weizhong Zhang, Renzhe Xu, Peng Cui, Tong Zhang.
Model Agnostic Sample Reweighting for Out-of-Distribution Learning.
ICML 2022.
-
Yong Lin*, Qing Lian*, and Tong Zhang.
An Empirical Study of Invariant Risk Minimization on Deep Models.
ICML 2021 Workshop on UDL.
-
Yong Lin, Zheng Xu.
Cable Sheath Loss Reduction Strategy Research Based on the Coupled Line Model.
IEEE Transactions on Power Delivery.
Selected Awards
-
Outstanding Paper Award, NAACL 2024.
-
Apple Scholars in AI/ML PhD Fellowship, 2023 (22 awardees worldwide).
-
Outstanding Graduate of Zhejiang Province, 2013.
Experiences
-
Princeton University, Postdoctoral Fellow, Sep 2024 - present.
-
The Hong Kong University of Science and Technology, PhD Student, 2020 - 2024.
-
Alibaba, Senior Machine Learning Engineer, 2017 - 2020.
-
Zhejiang University, Bachelor's and Master's Student (ranked 1/207), 2009 - 2016.