📝 Publications

arXiv

πRL: Online RL Fine-tuning for Flow-based Vision-Language-Action Models

Kang Chen, Zhihao Liu, Tonghe Zhang, Zhen Guo, Si Xu, Hao Lin, Hongzhi Zang, Quanlu Zhang, Zhaofei Yu, Guoliang Fan, Tiejun Huang, Yu Wang, Chao Yu

[arXiv] [Code]

  • We introduce πRL, the first open-source framework for efficient RL fine-tuning of flow-based VLAs.
CVPR 2025 - Highlight

USP-Gaussian: Unifying Spike-based Image Reconstruction, Pose Correction and Gaussian Splatting

Kang Chen, Jiyuan Zhang, Zecheng Hao, Yajing Zheng, Tiejun Huang and Zhaofei Yu

[arXiv] [Code]

  • We demonstrate that the spike-to-image reconstruction and 3D reconstruction tasks can mutually facilitate each other's optimization.
NeurIPS 2024 - Spotlight

SpikeReveal: Unlocking Temporal Sequences from Real Blurry Inputs with Spike Streams

Kang Chen, Shiyan Chen, Jiyuan Zhang, Baoyue Zhang, Yajing Zheng, Tiejun Huang and Zhaofei Yu

[arXiv] [Code]

  • We develop a self-supervised, spike-guided image deblurring framework that addresses the performance degradation caused by the synthetic-to-real domain gap in supervised methods.
  • We perform an in-depth theoretical analysis of the fusion between the spike stream and the blurry image, leading to the development of the SDM.
AAAI 2025

Rethinking High-speed Image Reconstruction Framework with Spike Camera

Kang Chen, Yajing Zheng, Tiejun Huang and Zhaofei Yu

[arXiv] [Code]

  • We introduce a novel spike-based image reconstruction framework that leverages the CLIP model to supervise network training with the class labels of the captured objects and the features of high-quality images.
  • We design a high-quality image generation pipeline and demonstrate that a lightweight reconstruction network is sufficient for the spike-to-image task when the supervision signal is weak.
TMM 2024

Motion Deblur by Learning Residual from Events

Kang Chen and Lei Yu

[Paper] [Code]

  • We propose a Two-Stage Residual-based Motion Deblurring (TRMD) framework for event cameras, which uses the residual sequence as an intermediate variable, providing a stronger supervision signal for network training.