Fu-Yun Wang (pronounced "Foo-Yoon Wahng", IPA: [fu˧˥ yn˧˥ wɑŋ]) is a second-year Ph.D. candidate at MMLab@CUHK.
My research interests now focus on scalable post-training techniques for diffusion models and unified multimodal models.
I plan to enter the job market in 2027 and am open to overseas opportunities, including industry generative AI roles and postdoctoral positions. Feel free to reach out early to discuss potential collaborations.
Below is an interactive tree diagram categorizing my research work by direction. Click a node to expand or collapse it, and click a paper title to open its link.
Research Intern | 2022.6 - 2022.12
Worked on Class-Incremental Learning.
Supervised by: Dr. Liu Liu
Collaborated with Dr. Yatao Bian, who provided guidance and discussions.
Research Collaboration | 2023.10 - 2024.10
Worked on Video Diffusion Models, Diffusion Distillation.
Supervised by: Dr. Zhaoyang Huang
Collaborated with Dr. Xiaoyu Shi and Weikang Bian, who provided guidance and discussions.
Research Intern | 2025.2 - 2025.5
Focused on Diffusion Distillation, Reinforcement Learning.
Supervised by: Dr. Long Zhao, Dr. Ting Liu, Dr. Hao Zhou and Dr. Liangzhe Yuan
Collaborated with Prof. Bohyung Han and Prof. Boqing Gong, who provided guidance and discussions.
Research Intern | 2025.6 - Present
Focused on Multimodal Language Models, Diffusion Models, Reinforcement Learning.
Supervised by: Dr. Han Zhang
Ph.D. in Engineering | 2023 - Present
Supervisor: Professor Hongsheng Li and Professor Xiaogang Wang
B.Eng. in Artificial Intelligence (rank 2/88) | 2019 - 2023
Supervisor: Professor Han-Jia Ye and Professor Da-Wei Zhou (LAMDA Group)
Below are some of my selected publications, categorized by theme. For a complete list, please visit my Google Scholar profile.
Fu-Yun Wang,
Ling Yang,
Zhaoyang Huang,
Mengdi Wang,
Hongsheng Li
Thirteenth International Conference on Learning Representations. ICLR 2025.
We presented an in-depth theoretical analysis and empirical validation of flow matching, rectified flow, and the rectification operation. We demonstrated that the rectification operation also applies to general diffusion models, and that flow matching is fundamentally no different from the traditional noise-addition formulation of DDPM. Our related blog post on Zhihu garnered over 10k views and roughly 400 likes.
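In standard notation (assumed here for illustration, not taken verbatim from the paper), the claimed equivalence can be sketched as follows: both constructions build the noisy sample as an affine Gaussian interpolation between data and noise,

```latex
% Rectified flow / flow matching interpolation
x_t = (1 - t)\,x_0 + t\,\epsilon, \qquad \epsilon \sim \mathcal{N}(0, I)

% DDPM forward (noise-addition) process
x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1 - \bar{\alpha}_t}\,\epsilon

% Both are instances of the generic affine path
x_t = \alpha_t\,x_0 + \sigma_t\,\epsilon
```

so the two differ only in the schedule $(\alpha_t, \sigma_t)$, up to a time reparameterization, which is the sense in which flow matching coincides with DDPM-style noise addition.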
arXiv •
GitHub •
Poster
Fu-Yun Wang,
Zhaoyang Huang,
Alexander William Bergman,
Dazhong Shen,
Peng Gao,
Michael Lingelbach,
Keqiang Sun,
Weikang Bian,
Guanglu Song,
Yu Liu,
Xiaogang Wang,
Hongsheng Li
Conference on Neural Information Processing Systems. NeurIPS 2024.
We validated and enhanced the effectiveness of consistency models for text-to-image and text-to-video generation. Our method has been adopted by the FastVideo project, successfully accelerating state-of-the-art video diffusion models including HunyuanVideo and WAN.
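For background, the defining property of a consistency model (standard formulation, with notation assumed here rather than taken from the paper) is that a network $f_\theta$ maps every point on the same probability-flow ODE trajectory to the trajectory's origin:

```latex
f_\theta(x_t, t) = f_\theta(x_{t'}, t') \quad \text{for all } t, t' \text{ on the same ODE trajectory},
\qquad f_\theta(x_\epsilon, \epsilon) = x_\epsilon \;\;\text{(boundary condition)}
```

This self-consistency property is what enables few-step, or even one-step, sampling once the model is distilled from a pretrained diffusion teacher.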
Project Page •
GitHub •
Paper •
Poster
Fu-Yun Wang,
Yunhao Shui,
Jingtan Piao,
Keqiang Sun,
Hongsheng Li
Thirteenth International Conference on Learning Representations. ICLR 2025.
We proposed a general, simple yet effective method for strengthened diffusion preference optimization, improving the alignment of generated outputs with user preferences.
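For context, the standard DPO objective that diffusion preference optimization methods build on (generic notation assumed here, not the paper's exact formulation): given a preferred sample $y_w$ and a dispreferred sample $y_l$ for a prompt $x$,

```latex
\mathcal{L}_{\mathrm{DPO}}(\theta)
= -\,\mathbb{E}_{(x,\, y_w,\, y_l)}
\left[ \log \sigma\!\left( \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
- \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)} \right) \right]
```

where $\pi_{\mathrm{ref}}$ is the frozen reference model and $\beta$ controls deviation from it; diffusion variants replace the intractable likelihood ratios with per-step surrogates.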
Paper •
Poster •
GitHub
Xiaoyu Shi*,
Zhaoyang Huang*,
Fu-Yun Wang*,
Weikang Bian*,
Dasong Li,
Yi Zhang,
Manyuan Zhang,
Kachun Cheung,
Simon See,
Hongwei Qin,
Jifeng Dai,
Hongsheng Li
Special Interest Group on Computer Graphics and Interactive Techniques. SIGGRAPH 2024.
SIGGRAPH 2024 Technical Papers Trailer
Project Page •
GitHub •
arXiv
Fu-Yun Wang,
Zhaoyang Huang,
Qiang Ma,
Xudong Lu,
Weikang Bian,
Yijin Li,
Yu Liu,
Hongsheng Li
European Conference on Computer Vision. ECCV 2024.
ECCV 2024 Oral Presentation
Project Page •
Paper
Da-Wei Zhou*,
Fu-Yun Wang*,
Han-Jia Ye,
De-Chuan Zhan
SCIENCE CHINA Information Sciences. SCIS.
PyCIL stands out as a comprehensive and user-friendly Python toolbox for Class-Incremental Learning. Boasting nearly 1,000 stars on GitHub, it is currently the most widely starred CIL toolkit, adopted by researchers worldwide. It provides a standardized framework for implementing and evaluating various CIL algorithms, fostering reproducible research and accelerating advancements in the field.
GitHub •
arXiv •
Media
Fu-Yun Wang,
Da-Wei Zhou,
Han-Jia Ye,
De-Chuan Zhan
European Conference on Computer Vision. ECCV 2022.
FOSTER introduces a novel approach to Class-Incremental Learning by combining feature boosting and compression strategies. This method effectively mitigates catastrophic forgetting while promoting the learning of new classes, showcasing robust performance in dynamic learning environments.
GitHub •
arXiv
| Journal/Conference | Years |
| --- | --- |
| IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) | - |
| IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) | - |
| Pattern Recognition Letters (PRL) | - |
| Conference on Computer Vision and Pattern Recognition (CVPR) | 2023, 2024, 2025 |
| Neural Information Processing Systems (NeurIPS) | 2023, 2025 |
| International Conference on Learning Representations (ICLR) | 2024, 2025 |
| International Conference on Machine Learning (ICML) | 2024, 2025 |
| European Conference on Computer Vision (ECCV) | 2024 |
| International Conference on Computer Vision (ICCV) | 2025 |
| British Machine Vision Conference (BMVC) | 2024 |
| SIGGRAPH Asia | 2025 |