Publications

ActivePose: Active 6D Object Pose Estimation and Tracking for Robotic Manipulation

Published in arXiv (CoRR), 2025

An active perception pipeline for 6-DoF object pose estimation and tracking that combines geometry-aware vision-language model (VLM) prompting, next-best-view (NBV) planning, and an equivariant diffusion policy for active tracking.

Recommended citation: Sheng Liu, Zhe Li, Weiheng Wang, Han Sun, Heng Zhang, Hongpeng Chen, Yusen Qin, Arash Ajoudani, Yizhao Wang. (2025). "ActivePose: Active 6D Object Pose Estimation and Tracking for Robotic Manipulation." arXiv preprint arXiv:2509.11364. https://arxiv.org/pdf/2509.11364

RoboBERT: An End-to-end Multimodal Robotic Manipulation Model

Published in arXiv (v2, cs.RO / cs.LG), 2025

An end-to-end multimodal manipulation model trained with a two-stage paradigm: stable policy learning first, then rapid alignment to diverse natural-language variants. Systematic visual data augmentation is applied for robustness.

Recommended citation: Sicheng Wang, Sheng Liu, Weiheng Wang, Jianhua Shan, Bin Fang. (2025). "RoboBERT: An End-to-end Multimodal Robotic Manipulation Model." arXiv preprint arXiv:2502.07837v2. doi:10.48550/arXiv.2502.07837. https://arxiv.org/pdf/2502.07837v2