About Me
I am a Research Scientist at Meta Reality Labs. I obtained my Ph.D. degree (2019-2024) from the Department of Computer Science and Engineering (CSE), Michigan State University (MSU), where I worked with Prof. Yu Kong. Prior to MSU, I spent three years of my Ph.D. (2019-2022) at the B. Thomas Golisano College of Computing and Information Sciences (GCCIS) of the Rochester Institute of Technology (RIT), working with Prof. Yu Kong and Prof. Qi Yu. I received my Master's degree (2016-2019) and Bachelor's degree (2012-2016) from the School of Remote Sensing and Information Engineering, Wuhan University (WHU), where I was advised by Prof. Daiqin Yang and Prof. Zhenzhong Chen at the Lab of Intelligent Information Processing (IIP). I have collaborated through research internships with excellent industrial researchers at Apple, OPPO U.S. Research Center, and NEC Laboratories America.
I develop AI to understand the open visual world. I am broadly interested in real-world computer vision challenges, including visual recognition, prediction, understanding, and reasoning. My Ph.D. research covers open-set recognition, video understanding, vision-language modeling, and 3D vision. Recently, I have been particularly interested in multi-modal LLMs and generative AI for vision applications.
[🔥🔥🔥] We are actively looking for prospective Ph.D. research interns to work on biometric authentication and generative AI, with the goal of top-tier publications. If you are interested and have relevant publications on topics such as face recognition, Re-ID, deep metric learning, domain adaptation, or diffusion-based image generation, please send me your CV.
News
- 2024.10: 🎉🎉🎉 One paper is accepted by WACV 2025.
- 2024.07: I joined Meta Reality Labs as a research scientist.
- 2024.07: I successfully passed my Ph.D. dissertation defense in the CSE Department at MSU.
- 2024.07: Three papers are accepted by ECCV 2024 (two co-authored)!
- 2024.03: I was selected to present at the CVPR 2024 Doctoral Consortium and chat with Prof. Jason Corso.
- 2024.02: I successfully passed the MSU Ph.D. Comprehensive Exam and became a Ph.D. candidate!
- 2023.07: One paper is accepted by ICCV 2023.
- 2023.05: I was invited to deliver a talk on open-set recognition at the 2nd MSU-ND workshop.
- 2023.02: I will be a research intern at NEC Laboratories America, Inc. (Princeton, NJ) this summer.
- 2023.02: One co-authored paper is accepted by CVPR 2023.
- 2022.08: I started my second journey of Ph.D. study at the CSE department at MSU!
- 2022.07: One co-authored paper is accepted by ECCV 2022.
- 2022.06: Started my internship at OPPO U.S. Research Center in Palo Alto, CA. (on-site)
- 2022.05: I attended ICRA 2022 on-site in Philadelphia, PA.
- 2022.04: I received the CVPR 2022 Travel Award to attend the conference in New Orleans, LA.
- 2022.03: One paper is accepted by CVPR 2022 for Oral presentation!
- 2021.10: One co-authored paper is accepted by BMVC 2021.
- 2021.07: Two papers are accepted by ICCV 2021, with one paper for Oral presentation!
- 2021.06: Started my internship at Apple Inc., with the 3D Vision Team at Apple Maps. (remote)
- 2021.04: One co-authored paper is accepted by IJCNN 2021.
- 2020.07: Two papers are accepted by ACM MM 2020 (one co-authored).
- 2020.07: One co-authored paper is accepted by ECCV 2020.
- 2020.06: One paper is accepted by IROS 2020.
- 2020.06: One co-authored paper is accepted by ICPR 2020.
- 2020.05: I passed the Ph.D. Research Potential Assessment!
- 2019.08: Started my new journey at RIT, Rochester, NY.
Selected Publications
Conferences
Exploiting VLM Localizability and Semantics for Open Vocabulary Action Detection
Wentao Bao, Kai Li, Yuxiao Chen, Deep Patel, Martin Renqiang Min, Yu Kong
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2025
Code
BibTeX
Prompting Language-Informed Distribution for Compositional Zero-Shot Learning
Wentao Bao, Lichang Chen, Heng Huang, Yu Kong
European Conference on Computer Vision (ECCV), 2024
arXiv
Code
BibTeX
Facial Affective Behavior Analysis with Instruction Tuning
Yifan Li, Anh Dao, Wentao Bao, Zhen Tan, Tianlong Chen, Huan Liu, Yu Kong
European Conference on Computer Vision (ECCV), 2024
arXiv
Project
BibTeX
Learning to Localize Actions in Instructional Videos with LLM-Based Multi-Pathway Text-Video Alignment
Yuxiao Chen, Kai Li, Wentao Bao, Deep Patel, Yu Kong, Martin Renqiang Min, Dimitris N. Metaxas
European Conference on Computer Vision (ECCV), 2024
arXiv
BibTeX
Uncertainty-aware State Space Transformer for Egocentric 3D Hand Trajectory Forecasting
Wentao Bao, Lele Chen, Libing Zeng, Zhong Li, Yi Xu, Junsong Yuan, Yu Kong
International Conference on Computer Vision (ICCV), 2023
PDF
Code
Project
arXiv
BibTeX
3D-aware Facial Landmark Detection via Multiview Consistent Training on Synthetic Data
Libing Zeng, Lele Chen, Wentao Bao, Zhong Li, Yi Xu, Junsong Yuan, Nima Khademi Kalantari
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023
PDF
Code
Project
BibTeX
Towards Open Set Video Anomaly Detection
Yuansheng Zhu, Wentao Bao, Qi Yu
European Conference on Computer Vision (ECCV), 2022
PDF
arXiv
Code
BibTeX
OpenTAL: Towards Open Set Temporal Action Localization
Wentao Bao, Qi Yu, Yu Kong
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022 (Oral)
PDF
arXiv
Poster
Code
BibTeX
Gradient Frequency Modulation for Visually Explaining Video Understanding Models
Xinmiao Lin, Wentao Bao, Matthew Wright, Yu Kong
British Machine Vision Conference (BMVC), 2021
arXiv
PDF
BibTeX
Evidential Deep Learning for Open Set Action Recognition
Wentao Bao, Qi Yu, Yu Kong
International Conference on Computer Vision (ICCV), 2021 (Oral)
PDF
arXiv
Poster
Code
BibTeX
DRIVE: Deep Reinforced Accident Anticipation with Visual Explanation
Wentao Bao, Qi Yu, Yu Kong
International Conference on Computer Vision (ICCV), 2021
PDF
arXiv
Poster
Code
BibTeX
Multiple Instance Relational Learning for Video Anomaly Detection
Xiwen Dengxiong, Wentao Bao, Yu Kong
International Joint Conference on Neural Networks (IJCNN), 2021
DOI
BibTeX
Uncertainty-based Traffic Accident Anticipation with Spatio-Temporal Relational Learning
Wentao Bao, Qi Yu, Yu Kong
The 28th ACM International Conference on Multimedia (MM), 2020
arXiv
DOI
Code
Dataset
BibTeX
Activity-driven Weakly-Supervised Spatio-Temporal Grounding from Untrimmed Videos
Junwen Chen, Wentao Bao, Yu Kong
The 28th ACM International Conference on Multimedia (MM), 2020
DOI
BibTeX
Group Activity Prediction with Sequential Relational Anticipation Model
Junwen Chen, Wentao Bao, Yu Kong
European Conference on Computer Vision (ECCV), 2020
PDF
arXiv
Code
BibTeX
Object-Aware Centroid Voting for Monocular 3D Object Detection
Wentao Bao, Qi Yu, Yu Kong
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2020
PDF
arXiv
Demo
BibTeX
Privacy Attributes-aware Message Passing Neural Network for Visual Privacy Attributes Classification
Hanbin Hong, Wentao Bao, Yuan Hong, Yu Kong
International Conference on Pattern Recognition (ICPR), 2020
DOI
BibTeX
Journals
Human Scanpath Prediction based on Deep Convolutional Saccadic Model
Wentao Bao, Zhenzhong Chen
Elsevier Journal of Neurocomputing (Neurocomputing), 2020
DOI
BibTeX
MonoFENet: Monocular 3D Object Detection with Feature Enhancement Networks
Wentao Bao, Bin Xu, Zhenzhong Chen
IEEE Transactions on Image Processing (TIP), 2019
DOI
BibTeX
Group Lasso based Band Selection for Hyperspectral Image Classification
Daiqin Yang, Wentao Bao
IEEE Geoscience and Remote Sensing Letters (GRSL), 2017
DOI
Code
Dataset
BibTeX
Preprints
MADiff: Motion-Aware Mamba Diffusion Models for Hand Trajectory Prediction on Egocentric Videos
Junyi Ma, Xieyuanli Chen, Wentao Bao, Jingyi Xu, Hesheng Wang
Preprint, 2024
arXiv
Code
Project
BibTeX
Latent Space Energy-based Model for Fine-grained Open Set Recognition
Wentao Bao, Qi Yu, Yu Kong
Preprint, 2023
arXiv
BibTeX
On Model Explanations with Transferable Neural Pathways
Xinmiao Lin, Wentao Bao, Qi Yu, Yu Kong
Preprint, 2023
arXiv
BibTeX
Selected Awards & Honors
Awards
- CVPR 2024 Travel Award for presenting at the CVPR'24 Doctoral Consortium, Seattle, USA, 2024.
- CVPR 2022 Travel Award for attending the conference in person in New Orleans, USA, 2022.
- AAAI 2020 Travel Award for attending the conference in person in New York, USA, 2020.
- Postgraduate Academic Innovation Award from Wuhan University, 2020.
- Grand Prize Winner, ICME 2018 Grand Challenge on Salient360! Visual Attention Modeling for 360 Content, 2018.
- Bronze Award in Hubei Province, The 2nd China College Students "Internet Plus" Innovation and Entrepreneurship Competition, 2016.
- Second-Class Prize, The 3rd National Graduate Contest on Smart-City Technology and Creative Design, Abnormal Event Detection, 2016.
- First Prize, IEEE BigMM 2015 Challenge: "Large-Scale Object Tracking over a Multiple-Camera Network", 2015.
- Third-Class Prize, The 14th "Challenge Cup" National Undergraduate Curricular Academic Science and Technology Contest on "Smart City", 2015.
- Meritorious Winner, Mathematical Contest in Modeling (MCM), 2015.
- Second Prize, The 12th "SuperMap Cup" National Undergraduate GIS Contest, Android Application Development, 2014.
Honors
- Excellent Graduate, Wuhan University, 2019. (top 5%)
- China National Scholarship, 2018.
- The First-class Academic Scholarship, Wuhan University, 2017 & 2018. (top 10%)
- Outstanding Postgraduate Student, Wuhan University, 2017 & 2018. (top 10%)
- Excellent Graduate Freshman Scholarship of Wuhan University, 2016. (top 10%)
- Advanced Individual, Wuhan University, 2016.
Academic Services
Conference Reviewer
- IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR): 2021, 2022, 2023, 2024
- IEEE/CVF International Conference on Computer Vision (ICCV): 2021
- European Conference on Computer Vision (ECCV): 2022, 2024
- International Joint Conference on Artificial Intelligence (IJCAI): 2023
- ACM International Conference on Multimedia (ACM MM): 2019, 2020, 2021, 2022, 2023
- IEEE/CVF Winter Conference on Applications of Computer Vision (WACV): 2023, 2024, 2025
- IEEE International Conference on Multimedia and Expo (ICME): 2024
- IEEE International Conference on Robotics and Automation (ICRA): 2021
- IEEE International Conference on Web Services (ICWS): 2021
- IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI): 2020
Journal Reviewer
Membership
- IEEE Student Member.
- ACM Student Member.
Volunteer
Teaching
- Teaching Assistant, MSU CSE-402: Biometrics and Pattern Recognition, FS2023.
- Teaching Activities (DRL Intro.), RIT CSCI-631: Foundations of Computer Vision, SS2021 & SS2022.
- Teaching Activities (EDL Intro.), RIT CISC-849: PhD Seminar, FS2021.
Academic Talks
- 2021.08.20 & 2021.09.16: Delivered two academic talks on the Chinese media platforms Jishi and TechBeat, introducing our recent ICCV Oral paper.
- 2020.11.17: Delivered an academic talk at the 2020 RIT Graduate Virtual Showcase: A Vision Into the Future.