1 code implementation • ECCV 2020 • Zehao Yu, Lei Jin, Shenghua Gao
The task is extremely challenging because of the vast areas of non-texture regions in these scenes.
2 code implementations • 10 Apr 2024 • Jiale Xu, Weihao Cheng, Yiming Gao, Xintao Wang, Shenghua Gao, Ying Shan
We present InstantMesh, a feed-forward framework for instant 3D mesh generation from a single image, featuring state-of-the-art generation quality and significant training scalability.
no code implementations • 26 Mar 2024 • Binbin Huang, Zehao Yu, Anpei Chen, Andreas Geiger, Shenghua Gao
3D Gaussian Splatting (3DGS) has recently revolutionized radiance field reconstruction, achieving high quality novel view synthesis and fast rendering speed without baking.
no code implementations • 18 Mar 2024 • Yuting Xiao, Xuan Wang, Jiafei Li, Hongrui Cai, Yanbo Fan, Nan Xue, Minghui Yang, Yujun Shen, Shenghua Gao
To this end, we propose a novel approach, GauMesh, to bridge the 3D Gaussian and Mesh for modeling and rendering the dynamic scenes.
2 code implementations • 19 Jan 2024 • Chenyu Wang, Weixin Luo, Qianyu Chen, Haonan Mai, Jindi Guo, Sixun Dong, Xiaohua, Xuan, Zhengxin Li, Lin Ma, Shenghua Gao
Recently, the astonishing performance of large language models (LLMs) in natural language comprehension and generation tasks has triggered extensive exploration of their use as central controllers to build agent systems.
1 code implementation • 25 Nov 2023 • Wenqiao Li, Xiaohao Xu, Yao Gu, Bozhong Zheng, Shenghua Gao, Yingna Wu
During testing, the point cloud repeatedly goes through the Mask Reconstruction Network, with each iteration's output becoming the next input.
1 code implementation • 6 Nov 2023 • Shuo Wang, Jing Li, Zibo Zhao, Dongze Lian, Binbin Huang, Xiaomei Wang, Zhengxin Li, Shenghua Gao
Holistic scene understanding includes semantic segmentation, surface normal estimation, object boundary detection, depth estimation, etc.
1 code implementation • 16 Oct 2023 • Yiqun Zhao, Zibo Zhao, Jing Li, Sixun Dong, Shenghua Gao
Indoor scene generation aims at creating shape-compatible, style-consistent furniture arrangements within a spatially reasonable layout.
1 code implementation • ICCV 2023 • YiHao Zhi, Xiaodong Cun, Xuelin Chen, Xi Shen, Wen Guo, Shaoli Huang, Shenghua Gao
While previous methods are able to generate speech rhythm-synchronized gestures, the semantic context of the speech is generally lacking in the gesticulations.
no code implementations • 29 Aug 2023 • Yuting Xiao, Jingwei Xu, Zehao Yu, Shenghua Gao
This paper presents DebSDF to address these challenges, focusing on the utilization of uncertainty in monocular priors and the bias in SDF-based volume rendering.
no code implementations • 24 Jul 2023 • Jiaben Chen, Yichen Zhu, Dongze Lian, Jiaqi Yang, Yifu Wang, Renrui Zhang, Xinhang Liu, Shenhan Qian, Laurent Kneip, Shenghua Gao
We therefore propose to incorporate RGB information in an event-guided optical flow refinement strategy.
1 code implementation • NeurIPS 2023 • Zibo Zhao, Wen Liu, Xin Chen, Xianfang Zeng, Rui Wang, Pei Cheng, Bin Fu, Tao Chen, Gang Yu, Shenghua Gao
We present a novel alignment-before-generation approach to tackle the challenging task of generating general 3D shapes based on 2D images or texts.
no code implementations • 12 Jun 2023 • Jiale Xu, Xintao Wang, Yan-Pei Cao, Weihao Cheng, Ying Shan, Shenghua Gao
Enhancing AI systems to perform tasks following human instructions can significantly boost productivity.
no code implementations • 21 Apr 2023 • Binbin Huang, Xingyue Peng, Siyuan Shen, Suan Xia, Ruiqian Li, Yanhua Yu, Yuehan Wang, Shenghua Gao, Wenzheng Chen, Shiying Li, Jingyi Yu
The core of our method is to place the object near diffuse walls and augment the LOS scan in the front view with the NLOS scans from the surrounding walls, which serve as virtual "mirrors" to trap light toward the object.
1 code implementation • CVPR 2023 • Sixun Dong, Huazhang Hu, Dongze Lian, Weixin Luo, Yicheng Qian, Shenghua Gao
Sequential video understanding, as an emerging video understanding task, has drawn considerable attention from researchers because of its goal-oriented nature.
no code implementations • 15 Mar 2023 • Ye Huang, Di Kang, Shenghua Gao, Wen Li, Lixin Duan
One crucial design of the HFG is to protect the high-level features from being contaminated by using proper stop-gradient operations so that the backbone does not update according to the noisy gradient from the upsampler.
no code implementations • 1 Mar 2023 • Jing Li, Jinpeng Yu, Ruoyu Wang, Zhengxin Li, Zhengyu Zhang, Lina Cao, Shenghua Gao
As the unsupervised plane segments are usually noisy and inaccurate, we propose to assign different weights to the sampled points on the plane in plane estimation as well as the regularization loss.
no code implementations • CVPR 2023 • Jiale Xu, Xintao Wang, Weihao Cheng, Yan-Pei Cao, Ying Shan, XiaoHu Qie, Shenghua Gao
Specifically, we first generate a high-quality 3D shape from the input text in the text-to-shape stage as a 3D shape prior.
no code implementations • 28 Dec 2022 • Hongye Zeng, Kang Zhou, Songhan Ge, Yuchong Gao, Jianhao Zhao, Shenghua Gao, Rui Zheng
We propose VertMatch, a two-step framework to detect vertebral structures in 3D ultrasound volume by utilizing unlabeled data in semi-supervised manner.
1 code implementation • 29 Nov 2022 • Chunlin Yu, Ye Shi, Zimo Liu, Shenghua Gao, Jingya Wang
Lifelong person re-identification (LReID) is in significant demand for real-world development as a large amount of ReID data is captured from diverse locations over time and cannot be accessed at once inherently.
no code implementations • 26 Nov 2022 • Yuting Xiao, Yiqun Zhao, Yanyu Xu, Shenghua Gao
In the first stage, we focus on geometry reconstruction based on SDF representation, which would lead to a good geometry surface of the scene and also a sharp density.
1 code implementation • CVPR 2023 • Ruoyu Wang, Zehao Yu, Shenghua Gao
PlaneDepth estimates the depth distribution using a Laplacian Mixture Model based on orthogonal planes for an input image.
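As a rough illustration of what a Laplacian mixture over depth looks like for a single pixel, the sketch below evaluates the mixture density and its mean. The function names, weights, locations, and scales are hypothetical and are not taken from the PlaneDepth code; each component location simply stands in for the depth implied by one candidate plane.

```python
import math


def laplace_pdf(d, mu, b):
    """Density of a Laplace distribution with location mu and scale b > 0."""
    return math.exp(-abs(d - mu) / b) / (2.0 * b)


def mixture_pdf(d, weights, mus, scales):
    """Density of a Laplacian mixture; weights are assumed to sum to 1."""
    return sum(w * laplace_pdf(d, mu, b)
               for w, mu, b in zip(weights, mus, scales))


def mixture_mean(weights, mus):
    """Mean of the mixture: the weighted average of component locations."""
    return sum(w * mu for w, mu in zip(weights, mus))


# Hypothetical per-pixel values: three candidate plane depths (in metres).
weights = [0.7, 0.2, 0.1]
mus = [2.0, 3.5, 6.0]
scales = [0.1, 0.3, 0.5]

expected_depth = mixture_mean(weights, mus)
```

A small scale concentrates a component tightly around its plane depth, so the mixture can stay sharp at depth discontinuities where a single unimodal distribution would blur.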
1 code implementation • 31 Aug 2022 • YiHao Zhi, Shenhan Qian, Xinhao Yan, Shenghua Gao
Previous methods alleviate the inconsistency of lighting by learning a per-frame embedding, but this operation does not generalize to unseen poses.
1 code implementation • 20 Jul 2022 • Shenhan Qian, Jiale Xu, Ziwei Liu, Liqian Ma, Shenghua Gao
We propose united implicit functions (UNIF), a part-based method for clothed human reconstruction and animation with raw scans and skeletons as the input.
1 code implementation • 26 May 2022 • Binbin Huang, Xinhao Yan, Anpei Chen, Shenghua Gao, Jingyi Yu
We present an efficient frequency-based neural representation termed PREF: a shallow MLP augmented with a phasor volume that covers significantly broader spectra than previous Fourier feature mapping or Positional Encoding.
no code implementations • CVPR 2022 • Xianing Chen, Qiong Cao, Yujie Zhong, Jing Zhang, Shenghua Gao, DaCheng Tao
Our DearKD is a two-stage framework that first distills the inductive biases from the early intermediate layers of a CNN and then gives the transformer full play by training without distillation.
1 code implementation • CVPR 2022 • Huazhang Hu, Sixun Dong, Yiqun Zhao, Dongze Lian, Zhengxin Li, Shenghua Gao
Existing methods focus on repetitive action counting in short videos and struggle with the longer videos found in more realistic scenarios.
Ranked #2 on Repetitive Action Counting on RepCount
no code implementations • 18 Jan 2022 • Yuting Xiao, Jiale Xu, Shenghua Gao
Taylor3DNet exploits a set of discrete landmark points and their corresponding Taylor series coefficients to represent the implicit field of a 3D shape, and the number of landmark points is independent of the resolution of the iso-surface extraction.
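The idea of representing an implicit field by landmark points and Taylor coefficients can be sketched as follows. This is a simplified illustration, not the Taylor3DNet implementation: it assumes a second-order expansion, uses only the nearest landmark, and takes the coefficients from a known analytic field instead of a network. The dictionary layout and function names are hypothetical.

```python
def taylor_eval(query, landmarks):
    """Approximate an implicit field at `query` from its nearest landmark.

    Each landmark is a dict (hypothetical layout) with keys:
      "p": position, "f": field value at p,
      "g": gradient at p, "H": Hessian at p (list of rows).
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    lm = min(landmarks, key=lambda l: sq_dist(query, l["p"]))
    d = [q - p for q, p in zip(query, lm["p"])]
    n = len(d)
    first = sum(g * di for g, di in zip(lm["g"], d))
    second = 0.5 * sum(d[i] * lm["H"][i][j] * d[j]
                       for i in range(n) for j in range(n))
    return lm["f"] + first + second


def sphere_landmark(p):
    """Exact coefficients of f(x) = |x|^2 - 1, whose zero set is the unit sphere."""
    n = len(p)
    return {
        "p": list(p),
        "f": sum(x * x for x in p) - 1.0,
        "g": [2.0 * x for x in p],
        "H": [[2.0 if i == j else 0.0 for j in range(n)] for i in range(n)],
    }


landmarks = [sphere_landmark(p) for p in [(1, 0, 0), (0, 1, 0), (0, 0, 1)]]
value = taylor_eval((0.9, 0.1, 0.0), landmarks)
```

Because the field is evaluated analytically at any query point, the number of landmarks does not have to grow with the resolution of the grid used for iso-surface extraction.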
1 code implementation • CVPR 2022 • Yicheng Qian, Weixin Luo, Dongze Lian, Xu Tang, Peilin Zhao, Shenghua Gao
In this paper, we propose a novel sequence verification task that aims to distinguish positive video pairs performing the same action sequence from negative ones with step-level transformations but still conducting the same task.
no code implementations • 25 Oct 2021 • Wei Zhou, Xiangyu Zhang, Hongyu Wang, Shenghua Gao, Xin Lou
It is shown that by adding another transformation, the proposed method is able to synthesize high-quality RAW Bayer images with arbitrary size.
no code implementations • 22 Oct 2021 • Pingxuan Huang, Zhenhua Cui, Jing Li, Shenghua Gao, Bo Hu, Yanyan Fang
Further, considering the consistency between the observed and the predicted trajectories, a target domain offset discriminator is utilized to adversarially regularize the future trajectory predictions to be in line with the observed trajectories.
1 code implementation • 21 Oct 2021 • Yepeng Liu, Zaiwang Gu, Shenghua Gao, Dong Wang, Yusheng Zeng, Jun Cheng
Very often, the pose is estimated after the face detection.
no code implementations • 5 Oct 2021 • Kang Zhou, Jing Li, Weixin Luo, Zhengxin Li, Jianlong Yang, Huazhu Fu, Jun Cheng, Jiang Liu, Shenghua Gao
To mitigate this problem, in this paper, we propose a novel Proxy-bridged Image Reconstruction Network (ProxyAno) for anomaly detection in medical images.
no code implementations • 23 Sep 2021 • Xianing Chen, Chunlin Xu, Qiong Cao, Jialang Xu, Yujie Zhong, Jiale Xu, Zhengxin Li, Jingya Wang, Shenghua Gao
Transformers have shown preferable performance on many vision tasks.
1 code implementation • ICCV 2021 • Shenhan Qian, Zhi Tu, YiHao Zhi, Wen Liu, Shenghua Gao
Co-speech gesture generation is to synthesize a gesture sequence that not only looks real but also matches with the input speech audio.
2 code implementations • ICLR 2022 • Dongze Lian, Zehao Yu, Xing Sun, Shenghua Gao
Our proposed AS-MLP obtains 51.5 mAP on the COCO validation set and 49.5 MS mIoU on the ADE20K dataset, which is competitive compared to the transformer-based architectures.
Ranked #13 on Semantic Segmentation on DensePASS
no code implementations • CVPR 2021 • Zibo Zhao, Wen Liu, Yanyu Xu, Xianing Chen, Weixin Luo, Lei Jin, Bohui Zhu, Tong Liu, Binqiang Zhao, Shenghua Gao
One is a structure prior, which uses a human parsing map to represent the human body structure.
1 code implementation • CVPR 2021 • Binbin Huang, Dongze Lian, Weixin Luo, Shenghua Gao
Then we combine the contextual information from the landmark feature convolution module with the target's visual features for grounding.
1 code implementation • CVPR 2021 • Jiale Xu, Jia Zheng, Yanyu Xu, Rui Tang, Shenghua Gao
Then, we leverage the room layout prior, a strong structural constraint of the indoor scene, to guide the generation of target views.
1 code implementation • CVPR 2021 • Zhaoyuan Yin, Jia Zheng, Weixin Luo, Shenhan Qian, Hanling Zhang, Shenghua Gao
This paper proposes a framework for the interactive video object segmentation (VOS) in the wild where users can choose some frames for annotations iteratively.
1 code implementation • ICCV 2021 • Yanyu Xu, Ziming Zhong, Dongze Lian, Jing Li, Zhengxin Li, Xinxing Xu, Shenghua Gao
To fully leverage the data captured from different scenes with different view angles while reducing the annotation cost, this paper studies a novel crowd counting setting, i.e., using only partial annotations in each image as training data.
1 code implementation • 10 Dec 2020 • Yuting Xiao, Yanyu Xu, Ziming Zhong, Weixin Luo, Jiawei Li, Shenghua Gao
In this way, features corresponding to background and occlusion can be suppressed for amodal mask estimation.
2 code implementations • 18 Nov 2020 • Wen Liu, Zhixin Piao, Zhi Tu, Wenhan Luo, Lin Ma, Shenghua Gao
Also, we build a new dataset, namely iPER dataset, for the evaluation of human motion imitation, appearance transfer, and novel view synthesis.
no code implementations • NeurIPS 2020 • Peiyao Wang, Weixin Luo, Yanyu Xu, Haojie Li, Shugong Xu, Jianyu Yang, Shenghua Gao
Spatial Description Resolution, as a language-guided localization task, is proposed for target location in a panoramic street view, given corresponding language descriptions.
1 code implementation • ECCV 2020 • Kang Zhou, Yuting Xiao, Jianlong Yang, Jun Cheng, Wen Liu, Weixin Luo, Zaiwang Gu, Jiang Liu, Shenghua Gao
Finally, we use the reconstructed image to extract the structure and measure the difference between the structures extracted from the original and the reconstructed images.
1 code implementation • 15 Jul 2020 • Zehao Yu, Lei Jin, Shenghua Gao
Furthermore, because those textureless regions in indoor scenes (e.g., walls, floors, roofs, etc.) usually correspond to planar regions, we propose to leverage superpixels as a plane prior.
1 code implementation • CVPR 2018 • Kun Huang, Yifan Wang, Zihan Zhou, Tianjiao Ding, Shenghua Gao, Yi Ma
To this end, we have built a very large new dataset of over 5,000 images with wireframes thoroughly labelled by humans.
1 code implementation • ICLR 2020 • Dongze Lian, Yin Zheng, Yintao Xu, Yanxiong Lu, Leyu Lin, Peilin Zhao, Junzhou Huang, Shenghua Gao
Recently, Neural Architecture Search (NAS) has been successfully applied to multiple artificial intelligence areas and shows better performance compared with hand-designed networks.
1 code implementation • CVPR 2020 • Zehao Yu, Shenghua Gao
On one hand, the high-resolution depth map, the data-adaptive propagation method and the Gauss-Newton layer jointly guarantee the effectiveness of our method.
no code implementations • 11 Dec 2019 • Huihong Zhang, Jianlong Yang, Kang Zhou, Zhenjie Chai, Jun Cheng, Shenghua Gao, Jiang Liu
Firstly, our method trains a biomarker prediction network to learn the features of the biomarker.
no code implementations • 28 Nov 2019 • Kang Zhou, Shenghua Gao, Jun Cheng, Zaiwang Gu, Huazhu Fu, Zhi Tu, Jianlong Yang, Yitian Zhao, Jiang Liu
With the development of convolutional neural networks, deep learning has shown its success for retinal disease detection from optical coherence tomography (OCT) images.
2 code implementations • ICCV 2019 • Wen Liu, Zhixin Piao, Jie Min, Wenhan Luo, Lin Ma, Shenghua Gao
In this paper, we propose to use a 3D body mesh recovery module to disentangle the pose and shape, which can not only model the joint location and rotation but also characterize the personalized body shape.
no code implementations • 6 Aug 2019 • Tianyang Zhang, Huazhu Fu, Yitian Zhao, Jun Cheng, Mengjie Guo, Zaiwang Gu, Bing Yang, Yuting Xiao, Shenghua Gao, Jiang Liu
Generative Adversarial Networks (GANs) have the capability of synthesizing images, which have been successfully applied to medical image synthesis tasks.
1 code implementation • ECCV 2020 • Jia Zheng, Junfei Zhang, Jing Li, Rui Tang, Shenghua Gao, Zihan Zhou
Recently, there has been growing interest in developing learning-based methods to detect and utilize salient semi-global or global structures, such as junctions, lines, planes, cuboids, smooth surfaces, and all types of symmetries, for 3D scene modeling and understanding.
1 code implementation • 18 Jul 2019 • Yanyan Fang, Biyun Zhan, Wandi Cai, Shenghua Gao, Bo Hu
Then, to relate the density maps of neighbouring frames, a Locality-constrained Spatial Transformer (LST) module is introduced to estimate the density map of the next frame from that of the current frame.
1 code implementation • 4 Jul 2019 • Dongze Lian, Zehao Yu, Shenghua Gao
Our two-stage solution to gaze following has two merits: i) it mimics human behavior in gaze following and is therefore more psychologically plausible; ii) besides using a heatmap to supervise the output of our network, we can also leverage the gaze direction to facilitate the training of the gaze direction pathway, so our network can be trained more robustly.
1 code implementation • 30 May 2019 • Ziheng Zhang, Anpei Chen, Ling Xie, Jingyi Yu, Shenghua Gao
Specifically, we first introduce a new representation, namely a semantics-aware distance map (sem-dist map), to serve as our target for amodal segmentation instead of the commonly used masks and heatmaps.
no code implementations • 10 May 2019 • Jin Chen, Xinxiao Wu, Lixin Duan, Shenghua Gao
In this more general and practical scenario, a major challenge is how to select source instances in the shared classes across different domains for positive transfer.
1 code implementation • CVPR 2019 • Ziheng Zhang, Zhengxin Li, Ning Bi, Jia Zheng, Jinlei Wang, Kun Huang, Weixin Luo, Yanyu Xu, Shenghua Gao
In this paper, we present a novel framework to detect line segments in man-made environments.
no code implementations • 4 Apr 2019 • Minye Wu, Haibin Ling, Ning Bi, Shenghua Gao, Hao Sheng, Jingyi Yu
A natural solution to these challenges is to use multiple cameras with multiview inputs, though existing systems are mostly limited to specific targets (e.g., humans), static cameras, and/or camera calibration.
3 code implementations • 7 Mar 2019 • Zaiwang Gu, Jun Cheng, Huazhu Fu, Kang Zhou, Huaying Hao, Yitian Zhao, Tianyang Zhang, Shenghua Gao, Jiang Liu
In this paper, we propose a context encoder network (referred to as CE-Net) to capture more high-level information and preserve spatial information for 2D medical image segmentation.
Ranked #1 on Optic Disc Segmentation on Messidor
1 code implementation • CVPR 2019 • Zehao Yu, Jia Zheng, Dongze Lian, Zihan Zhou, Shenghua Gao
In the first stage, we train a CNN to map each pixel to an embedding space where pixels from the same plane instance have similar embeddings.
Ranked #1 on Plane Instance Segmentation on NYU Depth v2
no code implementations • 15 Oct 2018 • Anpei Chen, Minye Wu, Yingliang Zhang, Nianyi Li, Jie Lu, Shenghua Gao, Jingyi Yu
A surface light field represents the radiance of rays originating from any points on the surface in any directions.
no code implementations • ECCV 2018 • Hao Cheng, Dongze Lian, Shenghua Gao, Yanlin Geng
Inspired by the pioneering work of information bottleneck principle for Deep Neural Networks (DNNs) analysis, we design an information plane based framework to evaluate the capability of DNNs for image classification tasks, which not only helps understand the capability of DNNs, but also helps us choose a neural network which leads to higher classification accuracy more efficiently.
no code implementations • 31 Aug 2018 • Kang Zhou, Zaiwang Gu, Wen Liu, Weixin Luo, Jun Cheng, Shenghua Gao, Jiang Liu
To account for the relationships among images at different stages, we propose a Multi-Task learning strategy which predicts the label with both classification and regression.
2 code implementations • CVPR 2018 • Zongwei Wang, Xu Tang, Weixin Luo, Shenghua Gao
By grouping faces with target age together, the objective of face aging is equivalent to transferring aging patterns of faces within the target age group to the face whose aged face is to be synthesized.
1 code implementation • CVPR 2018 • Wen Liu, Weixin Luo, Dongze Lian, Shenghua Gao
To predict a future frame with higher quality for normal events, other than the commonly used appearance (spatial) constraints on intensity and gradient, we also introduce a motion (temporal) constraint in video prediction by enforcing the optical flow between predicted frames and ground truth frames to be consistent, and this is the first work that introduces a temporal constraint into the video prediction task.
no code implementations • CVPR 2018 • Yanyu Xu, Yanbing Dong, Junru Wu, Zhengzhong Sun, Zhiru Shi, Jingyi Yu, Shenghua Gao
This paper explores gaze prediction in dynamic $360^\circ$ immersive videos, i.e., based on the history scan path and VR contents, we predict where a viewer will look at an upcoming time.
1 code implementation • CVPR 2018 • Yanyu Xu, Zhixin Piao, Shenghua Gao
Specifically, motivated by the residual learning in deep learning, we propose to predict displacement between neighboring frames for each pedestrian sequentially.
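Predicting displacements instead of absolute positions can be sketched as below. This is a minimal illustration only: the helper names are hypothetical, and the displacements passed to `rollout` stand in for the output of whatever model predicts them.

```python
def to_displacements(track):
    """Convert a list of (x, y) positions into frame-to-frame displacements."""
    return [(x1 - x0, y1 - y0)
            for (x0, y0), (x1, y1) in zip(track, track[1:])]


def rollout(last_position, displacements):
    """Accumulate displacements sequentially to recover future positions."""
    positions = []
    x, y = last_position
    for dx, dy in displacements:
        x, y = x + dx, y + dy
        positions.append((x, y))
    return positions


# Toy pedestrian track; the "predicted" displacements here are simply the
# true ones, standing in for a model's output.
track = [(0.0, 0.0), (1.0, 0.5), (2.0, 1.5)]
displacements = to_displacements(track)
future = rollout(track[0], displacements)
```

The model then only has to learn a small residual per frame, which is the analogy with residual learning drawn in the sentence above.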
1 code implementation • 28 Dec 2017 • Wen Liu, Weixin Luo, Dongze Lian, Shenghua Gao
To predict a future frame with higher quality for normal events, other than the commonly used appearance (spatial) constraints on intensity and gradient, we also introduce a motion (temporal) constraint in video prediction by enforcing the optical flow between predicted frames and ground truth frames to be consistent, and this is the first work that introduces a temporal constraint into the video prediction task.
Ranked #2 on Traffic Accident Detection on SA
1 code implementation • 9 Oct 2017 • Yanyu Xu, Shenghua Gao, Junru Wu, Nianyi Li, Jingyi Yu
Specifically, we propose to decompose a personalized saliency map (referred to as PSM) into a universal saliency map (referred to as USM) predictable by existing saliency detection models and a new discrepancy map across users that characterizes personalized saliency.
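The decomposition can be sketched with plain nested lists, assuming it is additive (PSM = USM + discrepancy) and that saliency values lie in [0, 1]. Both assumptions, and the function names, are illustrative and not taken from the paper's code.

```python
def discrepancy_map(psm, usm):
    """Per-pixel difference between a personalized and a universal map."""
    return [[p - u for p, u in zip(p_row, u_row)]
            for p_row, u_row in zip(psm, usm)]


def personalized_map(usm, discrepancy):
    """Recombine a universal map with a discrepancy map, clamped to [0, 1]."""
    return [[min(1.0, max(0.0, u + d)) for u, d in zip(u_row, d_row)]
            for u_row, d_row in zip(usm, discrepancy)]


# Toy 2x2 maps: the universal map could come from any existing saliency
# detector, while the discrepancy is what would be predicted per user.
usm = [[0.2, 0.8], [0.5, 0.1]]
psm = [[0.3, 0.6], [0.5, 0.4]]

delta = discrepancy_map(psm, usm)
rebuilt = personalized_map(usm, delta)
```

Under this reading, only the discrepancy map needs a user-specific model, since the universal map is shared across viewers.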
1 code implementation • ICCV 2017 • Weixin Luo, Wen Liu, Shenghua Gao
Motivated by the capability of sparse coding based anomaly detection, we propose a Temporally-coherent Sparse Coding (TSC), in which we enforce that similar neighbouring frames are encoded with similar reconstruction coefficients.
Ranked #22 on Anomaly Detection on ShanghaiTech
no code implementations • 8 Jul 2016 • Liansheng Zhuang, Zihan Zhou, Jingwen Yin, Shenghua Gao, Zhouchen Lin, Yi Ma, Nenghai Yu
In the literature, most existing graph-based semi-supervised learning (SSL) methods only use the label information of observed samples in the label propagation stage, while ignoring such valuable information when learning the graph.
no code implementations • CVPR 2016 • Bingbing Ni, Xiaokang Yang, Shenghua Gao
Fine grained video action analysis often requires reliable detection and tracking of various interacting objects and human body parts, denoted as interactional object parsing.
5 code implementations • Conference 2016 • Yingying Zhang, Desen Zhou, Siqin Chen, Shenghua Gao, Yi Ma
To this end, we have proposed a simple but effective Multi-column Convolutional Neural Network (MCNN) architecture to map the image to its crowd density map.
Ranked #5 on Crowd Counting on Venice
no code implementations • 3 Sep 2014 • Liansheng Zhuang, Shenghua Gao, Jinhui Tang, Jingjing Wang, Zhouchen Lin, Yi Ma
This paper aims at constructing a good graph for discovering intrinsic data structures in a semi-supervised learning setting.
2 code implementations • 14 Apr 2014 • Tsung-Han Chan, Kui Jia, Shenghua Gao, Jiwen Lu, Zinan Zeng, Yi Ma
In this work, we propose a very simple deep learning network for image classification which comprises only the very basic data processing components: cascaded principal component analysis (PCA), binary hashing, and block-wise histograms.
Ranked #46 on Image Classification on MNIST
no code implementations • 31 Mar 2014 • Kui Jia, Tsung-Han Chan, Zinan Zeng, Shenghua Gao, Gang Wang, Tianzhu Zhang, Yi Ma
The task is to identify the inlier features and establish their consistent correspondences across the image set.
no code implementations • CVPR 2013 • Zinan Zeng, Shijie Xiao, Kui Jia, Tsung-Han Chan, Shenghua Gao, Dong Xu, Yi Ma
Our framework is motivated by the observation that samples from the same class repetitively appear in the collection of ambiguously labeled training images, while they are just ambiguously labeled in each image.