Hi, I am Zihui Xue (薛子慧), and I usually go by Sherry. I am a 4th-year Ph.D. candidate at UT Austin, advised by Prof. Kristen Grauman.
My research focuses on developing methods to better understand and structure the content of instructional videos.
Recent Projects
Progress-Aware Video Frame Captioning
Zihui Xue,
Joungbin An,
Xitong Yang,
Kristen Grauman
CVPR, 2025 [paper][webpage]
HOI-Swap: Swapping Objects in Videos with Hand-Object Interaction Awareness
Zihui Xue,
Mi Luo,
Changan Chen,
Kristen Grauman
NeurIPS, 2024 [paper][webpage]
Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos
Changan Chen*,
Puyuan Peng*,
Ami Baid,
Zihui Xue,
Wei-Ning Hsu, David Harwath, Kristen Grauman
ECCV, 2024 (Oral) [paper][webpage]
Put Myself in Your Shoes: Lifting the Egocentric Perspective from Exocentric Videos
Mi Luo,
Zihui Xue,
Alex Dimakis,
Kristen Grauman
ECCV, 2024 [paper]
Learning Object State Changes in Videos: An Open-World Perspective
Zihui Xue,
Kumar Ashutosh,
Kristen Grauman
CVPR, 2024 [paper][webpage]
Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives
Kristen Grauman, Andrew Westbury, Lorenzo Torresani, Kris Kitani, Jitendra Malik, ..., Zihui Xue, et al.
CVPR, 2024 (Oral) [paper][webpage] [blog]
Learning Fine-grained View-Invariant Representations from Unpaired Ego-Exo Videos via Temporal Alignment
Zihui Xue,
Kristen Grauman
NeurIPS, 2023 [paper][webpage] Fine-grained ego-exo view-invariant features that temporally align two videos from diverse viewpoints
Egocentric Video Task Translation
Zihui Xue,
Yale Song,
Kristen Grauman,
Lorenzo Torresani
CVPR, 2023 (Highlight) [paper][webpage] Holistic egocentric perception for a set of diverse video tasks