Efficient Temporal Extrapolation of Multimodal Large Language Models with Temporal Grounding Bridge. Yuxuan Wang, Yueqian Wang, Pengfei Wu, Jianxin Liang, Dongyan Zhao, Yang Liu, Zilong Zheng*. EMNLP 2024.
Varying Sentence Representations via Condition-Specified Routers. Ziyong Lin*, Quansen Wang, Zixia Jia†, Zilong Zheng†. EMNLP 2024.
Empowering Embodied Visual Tracking with Visual Foundation Models and Offline RL. Fangwei Zhong*†, Kui Wu, Hai Ci, Churan Wang, Hao Chen. ECCV 2024.
SlotLifter: Slot-guided Feature Lifting for Learning Object-centric Radiance Fields. Yu Liu*, Baoxiong Jia*, Yixin Chen, and Siyuan Huang†. ECCV 2024.
F-HOI: Toward Fine-grained Semantic-Aligned 3D Human-Object Interactions. Jie Yang⋆, Xuesong Niu⋆, Nan Jiang⋆, Ruimao Zhang†, and Siyuan Huang†. ECCV 2024.
VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding. Yue Fan*, Xiaojian Ma*†, Rujie Wu, Yuntao Du, Jiaqi Li, Zhi Gao, Qing Li. ECCV 2024.