Publications

Conference Papers:

*Equal Contribution. Corresponding Author.

  1. H. Lu*, Y. Wen*, P. Cheng, R. Ding, J. Guo, H. Xu, C. Wang, H. Chen, X. Jiang, G. Jiang, Search Self-play: Pushing the Frontier of Agent Capability without Supervision, International Conference on Learning Representations (ICLR), 2026

  2. Z. Li, P. Cheng, Z. Yu, F. Tong, A. Gao, T. Chang, X. Wan, E. Zhao, X. Jiang, G. Jiang, Eliminating Inductive Bias in Reward Models with Information-Theoretic Guidance, International Conference on Learning Representations (ICLR), 2026

  3. Y. Du*, Z. Li*, P. Cheng*, Z. Chen, Y. Xie, X. Wan, and A. Gao, Simplify RLHF as Reward-Weighted SFT: A Variational Method, Transactions on Machine Learning Research (TMLR), 2026

  4. Y. Du*, Z. Li*, P. Cheng, X. Wan, A. Gao, Atoxia: Red-teaming Large Language Models with Target Toxic Answers, Findings of Annual Conference of the Nations of the Americas Chapter of the ACL (NAACL), 2025

  5. P. Cheng, T. Hu, H. Xu, Z. Zhang, Y. Dai, L. Han, N. Du, X. Li, Self-playing Adversarial Language Game Enhances LLM Reasoning, Neural Information Processing Systems (NeurIPS), 2024

  6. D. Zeng*, Y. Dai*, P. Cheng*, T. Hu, W. Chen, N. Du, Z. Xu, On Diversified Preferences of Large Language Model Alignment, Findings of Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

  7. P. Cheng*, Y. Yang*, J. Li*, Y. Dai, T. Hu, P. Cao, N. Du, X. Li, Adversarial Preference Optimization: Enhancing Your Alignment via RM-LLM Games, Findings of the Association for Computational Linguistics (ACL), 2024

  8. J. Xie, P. Cheng, X. Liang, Y. Dai, and N. Du, Chunk, Align, Select: A Simple Long-sequence Processing Method for Transformers, Annual Meeting of the Association for Computational Linguistics (ACL), 2024

  9. K. Bai*, P. Cheng*, W. Hao, R. Henao, and L. Carin, Estimating Total Correlation with Mutual Information Estimators, Artificial Intelligence and Statistics Conference (AISTATS), 2023

  10. R. Wang*, P. Cheng*, R. Henao, Mitigating Gender Bias for Text Generation via Mutual Information Minimization, Artificial Intelligence and Statistics Conference (AISTATS), 2023

  11. S. Luo, P. Cheng, S. Yu, Semi-constraint Optimal Transport for Entity Alignment with Dangling Cases, Findings of the Association for Computational Linguistics (ACL), 2022

  12. P. Cheng*, W. Hao*, S. Yuan, S. Si, L. Carin, FairFil: Contrastive Neural Debiasing Method for Pretrained Text Encoders, International Conference on Learning Representations (ICLR), 2021

  13. S. Yuan*, P. Cheng*, R. Zhang, W. Hao, Z. Gan, and L. Carin, Improving Zero-Shot Voice Style Transfer via Disentangled Representation Learning, International Conference on Learning Representations (ICLR), 2021

  14. P. Cheng, W. Hao, S. Dai, J. Liu, Z. Gan, and L. Carin, CLUB: A Contrastive Log-ratio Upper Bound of Mutual Information, International Conference on Machine Learning (ICML), 2020

  15. P. Cheng, M. Min, D. Shen, C. Malon, Y. Zhang, Y. Li, and L. Carin, Improving Disentangled Text Representation Learning with Information Theoretical Guidance, Annual Meeting of the Association for Computational Linguistics (ACL), 2020

  16. P. Cheng, Y. Li, X. Zhang, L. Chen, D. Carlson, and L. Carin, Dynamic Embedding on Textual Networks via a Gaussian Process, AAAI Conference on Artificial Intelligence (AAAI), 2020 Oral

  17. P. Cheng*, D. Shen*, D. Sundararaman, X. Zhang, A. Celikyilmaz, and L. Carin, Learning Compressed Sentence Representations for On-Device Text Processing, Annual Meeting of the Association for Computational Linguistics (ACL), 2019 Oral

  18. L. Chen, G. Wang, C. Tao, D. Shen, P. Cheng, X. Zhang, W. Wang, Y. Zhang, and L. Carin, Improving Textual Network Embedding with Global Attention via Optimal Transport, Annual Meeting of the Association for Computational Linguistics (ACL), 2019

  19. C. Liu, J. Zhuo, P. Cheng, R. Zhang, J. Zhu, and L. Carin, Understand and Accelerate Particle-based Variational Inference, International Conference on Machine Learning (ICML), 2019

Technical Reports:

  1. Kimi Team, Kimi-VL Technical Report, 2025

  2. P. Cheng, J. Xie, K. Bai, Y. Dai, and N. Du, Everyone Deserves A Reward: Learning Customized Human Preferences, 2023

  3. P. Cheng, R. Li, Replacing Language Model for Style Transfer, 2022

Workshop Papers:

  1. P. Cheng, W. Hao, and L. Carin, Estimating Total Correlation with Mutual Information Bounds, Neural Information Processing Systems (NeurIPS) Workshop, 2020

  2. P. Cheng, Y. Li, X. Zhang, L. Chen, D. Carlson, and L. Carin, Gaussian-Process-Based Dynamic Embedding for Textual Networks, Neural Information Processing Systems (NeurIPS) Workshop, 2019

  3. P. Cheng, C. Liu, C. Li, D. Shen, R. Henao, and L. Carin, Straight-Through Estimator as Projected Wasserstein Gradient Flow, Neural Information Processing Systems (NeurIPS) Workshop, 2018 Spotlight