Currently I work as a staff research scientist with SenseTime Research.
At SenseTime I've worked on SensePhoto, the state-of-the-art mobile photography solution delivered to major smartphone OEMs. I earned my PhD at Beihang University, advised by Prof. Qinping Zhao and Prof. Bin Zhou. I also work as a visiting researcher at CVTEAM, led by Prof. Jia Li. I was a postdoc from 2019 to 2021, advised by Prof. Xiaogang Wang and Yebin Liu.
I study computer vision, event-based vision, machine learning, and optimization. My research focuses mainly on image and video processing with learning and optimization methods.
(†interns/students   *corresponding author)
Deep Bayesian Video Frame Interpolation
Zhiyang Yu†,
Yu Zhang*,
Xujie Xiang,
Dongqing Zou,
Xijun Chen,
Jimmy S. Ren ECCV, 2022
paper
/
code
By encoding the VFI prior into a few unfolded, learned gradient descent steps under the Bayesian regularization framework, our new VFI model achieves state-of-the-art results with only half the parameters of existing models, while showing better generalizability.
From Pose to Part: Weakly-Supervised Pose Evolution for Human Part Segmentation
Yifan Zhao,
Jia Li,
Yu Zhang,
Yonghong Tian IEEE TPAMI, 2022  
paper
/
code
Human part segmentation can be conducted without dense pixel-level annotations by evolving a coarse part class map with image boundary cues, constrained by pose and object-level annotations.
Training Weakly Supervised Video Frame Interpolation with Events
Zhiyang Yu†,
Yu Zhang*,
Deyuan Liu,
Dongqing Zou,
Xijun Chen,
Yebin Liu,
Jimmy S. Ren ICCV, 2021
paper/
code
Using an event camera allows you to train video interpolation models without the need for high frame-rate videos.
How to Learn a Domain Adaptive Event Simulator?
Daxin Gu†,
Jia Li,
Yu Zhang*,
Yonghong Tian ACM MM, 2021   (Oral Presentation) paper
/
code
A fully trainable white-box event camera simulator with divide-and-conquer domain adaptation that automatically calibrates its parameters towards the target domain.
Informative and Consistent Correspondence Mining for Cross-Domain Weakly Supervised Object Detection
Luwei Hou†,
Yu Zhang*,
Kui Fu,
Jia Li CVPR, 2021   (Oral Presentation) paper
/
supp.
Cross-domain pixel-level correspondences can be learned in a weakly supervised manner for object detector adaptation.
A dataset for aerial saliency detection and how to adapt existing saliency models to this task.
Efficient Low-resolution Face Recognition via Bridge Distillation
Shiming Ge,
Shengwei Zhao,
Chenyu Li,
Yu Zhang,
Jia Li TIP, 2020
paper
Simply learning feature super-resolution and knowledge distillation in a multi-task way produces an accurate face detector capable of processing 763 faces/s on a mobile phone.
Multiscale adversarial training on feature correlations defines an unsupervised structural preservation loss for novel view synthesis.
Real-time 3D Scene Reconstruction with Dynamically Moving Object using a Single Depth Camera
Feixiang Lu,
Bin Zhou,
Yu Zhang,
Qinping Zhao TVC, 2018   (Best Paper Award of CGI 2018) paper
By improving reference frame selection and 6D pose prediction, we reconstruct dynamic objects in real time while handling large motion.
Exploring Weakly Labeled Images for Video Object Segmentation with Submodular Proposal Selection
Yu Zhang,
Xiaowu Chen,
Jia Li,
Wei Teng,
Haokun Song
TIP, 2018
paper /
results
Modeling object part relations with simple priors enables accurate object localization in videos with weak supervision.
Exemplar detectors are explored to learn instance-specific saliency patterns and a large saliency detection dataset is proposed.
Semantic Object Segmentation in Tagged Videos via Detection
Yu Zhang,
Xiaowu Chen,
Jia Li,
Chen Wang,
Changqun Xia,
Jun Li
TPAMI, 2017
paper
Extended version of our CVPR 2015 paper, with an improved network flow solver and an object shape prior.
6-DOF Image Localization from Massive Geo-tagged Reference Images
Yafei Song,
Xiaowu Chen,
Xiaogang Wang,
Yu Zhang,
Jia Li TMM, 2016   (Best Paper Award of IEEE BigMM 2015) paper
Searching for posed reference images whose appearance is similar to the input image can uniquely determine its pose with fast inference.
Local Shape Transfer for Image Co-segmentation
Wei Teng*,
Yu Zhang*,
Xiaowu Chen,
Jia Li,
Zhiqiang He
BMVC, 2016   (Oral Presentation) paper /
extended abstract
Shapes of local image patches lie on a low-dimensional manifold, which serves as a consistency regularizer for image co-segmentation.
Cuboids detection in RGB-D images via maximum weighted clique
Han Zhang,
Xiaowu Chen,
Yu Zhang,
Jia Li,
Qing Li,
Xiaogang Wang
ICME, 2015
paper
By incorporating global layout consistency modelled with a maximum weighted clique, the previous detection rate of cuboid proposals in RGB-D images is doubled.
Semantic Object Segmentation via Detection in Weakly Labeled Video
Yu Zhang,
Xiaowu Chen,
Jia Li,
Chen Wang,
Changqun Xia CVPR, 2015   (Oral Presentation) paper
Weak object detectors can generate strong video object segmentation results via joint inference with a quadratic network flow model.
Geodesic Propagation for Semantic Labeling
Qing Li,
Xiaowu Chen,
Yafei Song,
Yu Zhang TIP, 2014
paper
We present a fast approach for semantic segmentation by propagating labels along geodesic paths in feature space.
Former Interns
Yu Shi, Undergraduate student from Peking University, internship 2021-2022. Now a graduate student at UCLA.
Zhe Jiang, Master's student from Sichuan University, internship 2018-2020. CVPR 2020 and ECCV 2020. Now a PhD student at The Hong Kong Polytechnic University.
Song Zhang, Master's student from Beihang University, internship 2018-2020. ECCV 2020. Now a researcher at SenseTime.
Luwei Hou, Master's student from Beihang University, internship 2020-2021. CVPR 2021. Now a researcher at SenseTime.
Many thanks to Jon Barron for sharing this template.