Search Results for author: Hang Zhou

Found 70 papers, 32 papers with code

Prior Constraints-based Reward Model Training for Aligning Large Language Models

1 code implementation1 Apr 2024 Hang Zhou, Chenglong Wang, Yimin Hu, Tong Xiao, Chunliang Zhang, Jingbo Zhu

Reinforcement learning with human feedback for aligning large language models (LLMs) trains a reward model typically using ranking loss with comparison pairs. However, the training procedure suffers from an inherent problem: the uncontrolled scaling of reward scores during reinforcement learning due to the lack of constraints while training the reward model. This paper proposes a Prior Constraints-based Reward Model (namely PCRM) training method to mitigate this problem.

reinforcement-learning

Attacking Transformers with Feature Diversity Adversarial Perturbation

no code implementations10 Mar 2024 Chenxing Gao, Hang Zhou, Junqing Yu, Yuteng Ye, Jiale Cai, Junle Wang, Wei Yang

Understanding the mechanisms behind Vision Transformer (ViT), particularly its vulnerability to adversarial perturba tions, is crucial for addressing challenges in its real-world applications.

Fine-grained Appearance Transfer with Diffusion Models

1 code implementation27 Nov 2023 Yuteng Ye, Guanwen Li, Hang Zhou, Cai Jiale, Junqing Yu, Yawei Luo, Zikai Song, Qilong Xing, Youjia Zhang, Wei Yang

A pivotal aspect of our approach is the strategic use of the predicted $x_0$ space by diffusion models within the latent space of diffusion processes.

Image-to-Image Translation

DAE-Net: Deforming Auto-Encoder for fine-grained shape co-segmentation

1 code implementation22 Nov 2023 Zhiqin Chen, Qimin Chen, Hang Zhou, Hao Zhang

We present an unsupervised 3D shape co-segmentation method which learns a set of deformable part templates from a shape collection.

DreamGaussian: Generative Gaussian Splatting for Efficient 3D Content Creation

1 code implementation28 Sep 2023 Jiaxiang Tang, Jiawei Ren, Hang Zhou, Ziwei Liu, Gang Zeng

In contrast to the occupancy pruning used in Neural Radiance Fields, we demonstrate that the progressive densification of 3D Gaussians converges significantly faster for 3D generative tasks.

3D Generation

Progressive Text-to-Image Diffusion with Soft Latent Direction

1 code implementation18 Sep 2023 Yuteng Ye, Jiale Cai, Hang Zhou, Guanwen Li, Youjia Zhang, Zikai Song, Chenxing Gao, Junqing Yu, Wei Yang

In spite of the rapidly evolving landscape of text-to-image generation, the synthesis and manipulation of multiple entities while adhering to specific relational constraints pose enduring challenges.

Language Modelling Large Language Model +1

ReliTalk: Relightable Talking Portrait Generation from a Single Video

1 code implementation5 Sep 2023 Haonan Qiu, Zhaoxi Chen, Yuming Jiang, Hang Zhou, Xiangyu Fan, Lei Yang, Wayne Wu, Ziwei Liu

Our key insight is to decompose the portrait's reflectance from implicitly learned audio-driven facial normals and images.

Single-Image Portrait Relighting

Learning Evaluation Models from Large Language Models for Sequence Generation

no code implementations8 Aug 2023 Chenglong Wang, Hang Zhou, Kaiyan Chang, Tongran Liu, Chunliang Zhang, Quan Du, Tong Xiao, Jingbo Zhu

Large language models achieve state-of-the-art performance on sequence generation evaluation, but typically have a large number of parameters.

Machine Translation Style Transfer +1

ESRL: Efficient Sampling-based Reinforcement Learning for Sequence Generation

3 code implementations4 Aug 2023 Chenglong Wang, Hang Zhou, Yimin Hu, Yifu Huo, Bei Li, Tongran Liu, Tong Xiao, Jingbo Zhu

Applying Reinforcement Learning (RL) to sequence generation models enables the direct optimization of long-term rewards (\textit{e. g.,} BLEU and human feedback), but typically requires large-scale sampling over a space of action sequences.

Abstractive Text Summarization Language Modelling +5

ShaDDR: Interactive Example-Based Geometry and Texture Generation via 3D Shape Detailization and Differentiable Rendering

1 code implementation8 Jun 2023 Qimin Chen, Zhiqin Chen, Hang Zhou, Hao Zhang

Furthermore, we showcase the ability of our method to learn geometric details and textures from shapes reconstructed from real-world photos.

Texture Synthesis

Detecting Errors in a Numerical Response via any Regression Model

2 code implementations26 May 2023 Hang Zhou, Jonas Mueller, Mayank Kumar, Jane-Ling Wang, Jing Lei

Noise plagues many numerical datasets, where the recorded values in the data may fail to match the true underlying values due to reasons including: erroneous sensors, data entry/processing mistakes, or imperfect human estimates.

regression

Building an Invisible Shield for Your Portrait against Deepfakes

no code implementations22 May 2023 Jiazhi Guan, Tianshu Hu, Hang Zhou, Zhizhi Guo, Lirui Deng, Chengbin Quan, Errui Ding, Youjian Zhao

Unlike authentic images, where the hidden messages can be extracted with precision, manipulating the facial attributes through deepfake techniques can disrupt the decoding process.

Face Swapping

StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-based Generator

no code implementations CVPR 2023 Jiazhi Guan, Zhanwang Zhang, Hang Zhou, Tianshu Hu, Kaisiyuan Wang, Dongliang He, Haocheng Feng, Jingtuo Liu, Errui Ding, Ziwei Liu, Jingdong Wang

Despite recent advances in syncing lip movements with any audio waves, current methods still struggle to balance generation quality and the model's generalization ability.

Dual Memory Units with Uncertainty Regulation for Weakly Supervised Video Anomaly Detection

1 code implementation10 Feb 2023 Hang Zhou, Junqing Yu, Wei Yang

To address this issue, we propose an Uncertainty Regulated Dual Memory Units (UR-DMU) model to learn both the representations of normal data and discriminative features of abnormal data.

Anomaly Detection Video Anomaly Detection

Masked Lip-Sync Prediction by Audio-Visual Contextual Exploitation in Transformers

no code implementations9 Dec 2022 Yasheng Sun, Hang Zhou, Kaisiyuan Wang, Qianyi Wu, Zhibin Hong, Jingtuo Liu, Errui Ding, Jingdong Wang, Ziwei Liu, Hideki Koike

This requires masking a large percentage of the original image and seamlessly inpainting it with the aid of audio and reference frames.

Audio-Driven Co-Speech Gesture Video Generation

no code implementations5 Dec 2022 Xian Liu, Qianyi Wu, Hang Zhou, Yuanqi Du, Wayne Wu, Dahua Lin, Ziwei Liu

Our key insight is that the co-speech gestures can be decomposed into common motion patterns and subtle rhythmic dynamics.

Video Generation

Ada3Diff: Defending against 3D Adversarial Point Clouds via Adaptive Diffusion

no code implementations29 Nov 2022 Kui Zhang, Hang Zhou, Jie Zhang, Qidong Huang, Weiming Zhang, Nenghai Yu

Deep 3D point cloud models are sensitive to adversarial attacks, which poses threats to safety-critical applications such as autonomous driving.

Autonomous Driving Denoising

Dynamic Feature Pruning and Consolidation for Occluded Person Re-Identification

1 code implementation27 Nov 2022 Yuteng Ye, Hang Zhou, Jiale Cai, Chenxing Gao, Youjia Zhang, Junle Wang, Qiang Hu, Junqing Yu, Wei Yang

The framework mainly consists of a sparse encoder, a multi-view feature mathcing module, and a feature consolidation decoder.

Person Re-Identification

Real-time Neural Radiance Talking Portrait Synthesis via Audio-spatial Decomposition

1 code implementation22 Nov 2022 Jiaxiang Tang, Kaisiyuan Wang, Hang Zhou, Xiaokang Chen, Dongliang He, Tianshu Hu, Jingtuo Liu, Gang Zeng, Jingdong Wang

While dynamic Neural Radiance Fields (NeRF) have shown success in high-fidelity 3D modeling of talking portraits, the slow training and inference speed severely obstruct their potential usage.

Talking Face Generation

Person Text-Image Matching via Text-Feature Interpretability Embedding and External Attack Node Implantation

1 code implementation16 Nov 2022 Fan Li, Hang Zhou, Huafeng Li, Yafei Zhang, Zhengtao Yu

Specifically, we improve the interpretability of text features by providing them with consistent semantic information with image features to achieve the alignment of text and describe image region features. To address the challenges posed by the diversity of text and the corresponding person images, we treat the variation caused by diversity to features as caused by perturbation information and propose a novel adversarial attack and defense method to solve it.

Adversarial Attack Person Search +1

Learning to Immunize Images for Tamper Localization and Self-Recovery

no code implementations28 Oct 2022 Qichao Ying, Hang Zhou, Zhenxing Qian, Sheng Li, Xinpeng Zhang

Image immunization (Imuge) is a technology of protecting the images by introducing trivial perturbation, so that the protected images are immune to the viruses in that the tampered contents can be auto-recovered.

TimesNet: Temporal 2D-Variation Modeling for General Time Series Analysis

3 code implementations5 Oct 2022 Haixu Wu, Tengge Hu, Yong liu, Hang Zhou, Jianmin Wang, Mingsheng Long

TimesBlock can discover the multi-periodicity adaptively and extract the complex temporal variations from transformed 2D tensors by a parameter-efficient inception block.

Action Recognition Anomaly Detection +4

StyleSwap: Style-Based Generator Empowers Robust Face Swapping

no code implementations27 Sep 2022 Zhiliang Xu, Hang Zhou, Zhibin Hong, Ziwei Liu, Jiaming Liu, Zhizhi Guo, Junyu Han, Jingtuo Liu, Errui Ding, Jingdong Wang

Our core idea is to leverage a style-based generator to empower high-fidelity and robust face swapping, thus the generator's advantage can be adopted for optimizing identity similarity.

Face Swapping

PointCAT: Contrastive Adversarial Training for Robust Point Cloud Recognition

no code implementations16 Sep 2022 Qidong Huang, Xiaoyi Dong, Dongdong Chen, Hang Zhou, Weiming Zhang, Kui Zhang, Gang Hua, Nenghai Yu

Notwithstanding the prominent performance achieved in various applications, point cloud recognition models have often suffered from natural corruptions and adversarial perturbations.

StyleFaceV: Face Video Generation via Decomposing and Recomposing Pretrained StyleGAN3

1 code implementation16 Aug 2022 Haonan Qiu, Yuming Jiang, Hang Zhou, Wayne Wu, Ziwei Liu

Notably, StyleFaceV is capable of generating realistic $1024\times1024$ face videos even without high-resolution training videos.

Image Generation Video Generation

Detecting Deepfake by Creating Spatio-Temporal Regularity Disruption

no code implementations21 Jul 2022 Jiazhi Guan, Hang Zhou, Mingming Gong, Errui Ding, Jingdong Wang, Youjian Zhao

Specifically, by carefully examining the spatial and temporal properties, we propose to disrupt a real video through a Pseudo-fake Generator and create a wide range of pseudo-fake videos for training.

DeepFake Detection Face Swapping

TokenMix: Rethinking Image Mixing for Data Augmentation in Vision Transformers

1 code implementation18 Jul 2022 Jihao Liu, Boxiao Liu, Hang Zhou, Hongsheng Li, Yu Liu

In this paper, we propose a novel data augmentation technique TokenMix to improve the performance of vision transformers.

Data Augmentation

Delving into Sequential Patches for Deepfake Detection

no code implementations6 Jul 2022 Jiazhi Guan, Hang Zhou, Zhibin Hong, Errui Ding, Jingdong Wang, Chengbin Quan, Youjian Zhao

Recent advances in face forgery techniques produce nearly visually untraceable deepfake videos, which could be leveraged with malicious intentions.

DeepFake Detection Face Swapping

Agriculture-Vision Challenge 2022 -- The Runner-Up Solution for Agricultural Pattern Recognition via Transformer-based Models

no code implementations23 Jun 2022 Zhicheng Yang, Jui-Hsin Lai, Jun Zhou, Hang Zhou, Chen Du, Zhongcheng Lai

The Agriculture-Vision Challenge in CVPR is one of the most famous and competitive challenges for global researchers to break the boundary between computer vision and agriculture sectors, aiming at agricultural pattern recognition from aerial images.

Data Augmentation

MultiEarth 2022 -- The Champion Solution for the Matrix Completion Challenge via Multimodal Regression and Generation

no code implementations17 Jun 2022 Bo Peng, Hongchen Liu, Hang Zhou, Yuchuan Gou, Jui-Hsin Lai

Earth observation satellites have been continuously monitoring the earth environment for years at different locations and spectral bands with different modalities.

Earth Observation Matrix Completion +2

MultiEarth 2022 -- The Champion Solution for Image-to-Image Translation Challenge via Generation Models

no code implementations17 Jun 2022 Yuchuan Gou, Bo Peng, Hongchen Liu, Hang Zhou, Jui-Hsin Lai

The MultiEarth 2022 Image-to-Image Translation challenge provides a well-constrained test bed for generating the corresponding RGB Sentinel-2 imagery with the given Sentinel-1 VV & VH imagery.

Image-to-Image Translation Translation

Image Protection for Robust Cropping Localization and Recovery

no code implementations6 Jun 2022 Qichao Ying, Hang Zhou, Xiaoxiao Hu, Zhenxing Qian, Sheng Li, Xinpeng Zhang

Existing image cropping detection schemes ignore that recovering the cropped-out contents can unveil the purpose of the behaved cropping attack.

Image Cropping

EAMM: One-Shot Emotional Talking Face via Audio-Based Emotion-Aware Motion Model

no code implementations30 May 2022 Xinya Ji, Hang Zhou, Kaisiyuan Wang, Qianyi Wu, Wayne Wu, Feng Xu, Xun Cao

Although significant progress has been made to audio-driven talking face generation, existing methods either neglect facial emotion or cannot be applied to arbitrary subjects.

Talking Face Generation

Few-Shot Head Swapping in the Wild

no code implementations CVPR 2022 Changyong Shu, Hemao Wu, Hang Zhou, Jiaming Liu, Zhibin Hong, Changxing Ding, Junyu Han, Jingtuo Liu, Errui Ding, Jingdong Wang

Particularly, seamless blending is achieved with the help of a Semantic-Guided Color Reference Creation procedure and a Blending UNet.

Face Swapping

Joint-Modal Label Denoising for Weakly-Supervised Audio-Visual Video Parsing

2 code implementations25 Apr 2022 Haoyue Cheng, Zhaoyang Liu, Hang Zhou, Chen Qian, Wayne Wu, LiMin Wang

This paper focuses on the weakly-supervised audio-visual video parsing task, which aims to recognize all events belonging to each modality and localize their temporal boundaries.

Denoising valid

SeCo: Separating Unknown Musical Visual Sounds with Consistency Guidance

no code implementations25 Mar 2022 Xinchi Zhou, Dongzhan Zhou, Wanli Ouyang, Hang Zhou, Ziwei Liu, Di Hu

Recent years have witnessed the success of deep learning on the visual sound separation task.

Shape-invariant 3D Adversarial Point Clouds

1 code implementation CVPR 2022 Qidong Huang, Xiaoyi Dong, Dongdong Chen, Hang Zhou, Weiming Zhang, Nenghai Yu

In this paper, we propose a novel Point-Cloud Sensitivity Map to boost both the efficiency and imperceptibility of point perturbations.

Visual Sound Localization in the Wild by Cross-Modal Interference Erasing

1 code implementation13 Feb 2022 Xian Liu, Rui Qian, Hang Zhou, Di Hu, Weiyao Lin, Ziwei Liu, Bolei Zhou, Xiaowei Zhou

Specifically, we observe that the previous practice of learning only a single audio representation is insufficient due to the additive nature of audio signals.

Semantic-Aware Implicit Neural Audio-Driven Video Portrait Generation

no code implementations19 Jan 2022 Xian Liu, Yinghao Xu, Qianyi Wu, Hang Zhou, Wayne Wu, Bolei Zhou

Moreover, to enable portrait rendering in one unified neural radiance field, a Torso Deformation module is designed to stabilize the large-scale non-rigid torso motions.

SAC-GAN: Structure-Aware Image Composition

1 code implementation13 Dec 2021 Hang Zhou, Rui Ma, Ling-Xiao Zhang, Lin Gao, Ali Mahdavi-Amiri, Hao Zhang

Specifically, our network takes the semantic layout features from the input scene image, features encoded from the edges and silhouette in the input object patch, as well as a latent code as inputs, and generates a 2D spatial affine transform defining the translation and scaling of the object patch.

Image Augmentation Object

Hyperspectral Mixed Noise Removal via Subspace Representation and Weighted Low-rank Tensor Regularization

no code implementations13 Nov 2021 Hang Zhou, Yanchi Su, Zhanshan Li

Recently, the low-rank property of different components extracted from the image has been considered in man hyperspectral image denoising methods.

Hyperspectral Image Denoising Image Denoising

From Image to Imuge: Immunized Image Generation

1 code implementation27 Oct 2021 Qichao Ying, Zhenxing Qian, Hang Zhou, Haisheng Xu, Xinpeng Zhang, Siyi Li

At the recipient's side, the verifying network localizes the malicious modifications, and the original content can be approximately recovered by the decoder, despite the presence of the attacks.

Image Cropping Image Generation

Hiding Images into Images with Real-world Robustness

no code implementations12 Oct 2021 Qichao Ying, Hang Zhou, Xianhan Zeng, Haisheng Xu, Zhenxing Qian, Xinpeng Zhang

The existing image embedding networks are basically vulnerable to malicious attacks such as JPEG compression and noise adding, not applicable for real-world copyright protection tasks.

Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation

1 code implementation CVPR 2021 Hang Zhou, Yasheng Sun, Wayne Wu, Chen Change Loy, Xiaogang Wang, Ziwei Liu

While speech content information can be defined by learning the intrinsic synchronization between audio-visual modalities, we identify that a pose code will be complementarily learned in a modulated convolution-based reconstruction framework.

Talking Face Generation

Audio-Driven Emotional Video Portraits

1 code implementation CVPR 2021 Xinya Ji, Hang Zhou, Kaisiyuan Wang, Wayne Wu, Chen Change Loy, Xun Cao, Feng Xu

In this work, we present Emotional Video Portraits (EVP), a system for synthesizing high-quality video portraits with vivid emotional dynamics driven by audios.

Disentanglement Face Generation

Visually Informed Binaural Audio Generation without Binaural Audios

no code implementations CVPR 2021 Xudong Xu, Hang Zhou, Ziwei Liu, Bo Dai, Xiaogang Wang, Dahua Lin

Moreover, combined with binaural recordings, our method is able to further boost the performance of binaural audio generation under supervised settings.

Audio Generation

Reversible Watermarking in Deep Convolutional Neural Networks for Integrity Authentication

no code implementations9 Apr 2021 Xiquan Guan, Huamin Feng, Weiming Zhang, Hang Zhou, Jie Zhang, Nenghai Yu

Specifically, we present the reversible watermarking problem of deep convolutional neural networks and utilize the pruning theory of model compression technology to construct a host sequence used for embedding watermarking information by histogram shift.

Model Compression

Adversarial Examples Detection beyond Image Space

1 code implementation23 Feb 2021 Kejiang Chen, Yuefeng Chen, Hang Zhou, Chuan Qin, Xiaofeng Mao, Weiming Zhang, Nenghai Yu

To detect both few-perturbation attacks and large-perturbation attacks, we propose a method beyond image space by a two-stream architecture, in which the image stream focuses on the pixel artifacts and the gradient stream copes with the confidence artifacts.

LG-GAN: Label Guided Adversarial Network for Flexible Targeted Attack of Point Cloud-based Deep Networks

no code implementations1 Nov 2020 Hang Zhou, Dongdong Chen, Jing Liao, Weiming Zhang, Kejiang Chen, Xiaoyi Dong, Kunlin Liu, Gang Hua, Nenghai Yu

To overcome these shortcomings, this paper proposes a novel label guided adversarial network (LG-GAN) for real-time flexible targeted point cloud attack.

Discriminability Distillation in Group Representation Learning

no code implementations ECCV 2020 Manyuan Zhang, Guanglu Song, Hang Zhou, Yu Liu

We show the discrimiability knowledge has good properties that can be distilled by a light-weight distillation network and can be generalized on the unseen target set.

Representation Learning

LG-GAN: Label Guided Adversarial Network for Flexible Targeted Attack of Point Cloud Based Deep Networks

no code implementations CVPR 2020 Hang Zhou, Dongdong Chen, Jing Liao, Kejiang Chen, Xiaoyi Dong, Kunlin Liu, Weiming Zhang, Gang Hua, Nenghai Yu

To overcome these shortcomings, this paper proposes a novel label guided adversarial network (LG-GAN) for real-time flexible targeted point cloud attack.

Rotate-and-Render: Unsupervised Photorealistic Face Rotation from Single-View Images

1 code implementation CVPR 2020 Hang Zhou, Jihao Liu, Ziwei Liu, Yu Liu, Xiaogang Wang

Though face rotation has achieved rapid progress in recent years, the lack of high-quality paired training data remains a great hurdle for existing methods.

3D Face Modelling Data Augmentation +1

Powerful Speaker Embedding Training Framework by Adversarially Disentangled Identity Representation

no code implementations27 Nov 2019 Jianwei Tai, Hang Zhou, Qingjia Huang, Xiaoqi Jia

The main challenge of speaker verification in the wild is the interference caused by irrelevant information in speech and the lack of speaker labels in speech datasets.

Speaker Verification

Self-supervised Adversarial Training

1 code implementation15 Nov 2019 Kejiang Chen, Hang Zhou, Yuefeng Chen, Xiaofeng Mao, Yuhong Li, Yuan He, Hui Xue, Weiming Zhang, Nenghai Yu

Recent work has demonstrated that neural networks are vulnerable to adversarial examples.

Self-Supervised Learning

A Graph-Based Framework to Bridge Movies and Synopses

no code implementations ICCV 2019 Yu Xiong, Qingqiu Huang, Lingfeng Guo, Hang Zhou, Bolei Zhou, Dahua Lin

On top of this dataset, we develop a framework to perform matching between movie segments and synopsis paragraphs.

Vision-Infused Deep Audio Inpainting

no code implementations ICCV 2019 Hang Zhou, Ziwei Liu, Xudong Xu, Ping Luo, Xiaogang Wang

Extensive experiments demonstrate that our framework is capable of inpainting realistic and varying audio segments with or without visual contexts.

Audio inpainting Image Inpainting

ET-GAN: Cross-Language Emotion Transfer Based on Cycle-Consistent Generative Adversarial Networks

no code implementations27 May 2019 Xiaoqi Jia, Jianwei Tai, Hang Zhou, Yakai Li, Weijuan Zhang, Haichao Du, Qingjia Huang

Despite the remarkable progress made in synthesizing emotional speech from text, it is still challenging to provide emotion information to existing speech segments.

Domain Adaptation Generative Adversarial Network +2

DUP-Net: Denoiser and Upsampler Network for 3D Adversarial Point Clouds Defense

1 code implementation ICCV 2019 Hang Zhou, Kejiang Chen, Weiming Zhang, Han Fang, Wenbo Zhou, Nenghai Yu

We propose a Denoiser and UPsampler Network (DUP-Net) structure as defenses for 3D adversarial point cloud classification, where the two modules reconstruct surface smoothness by dropping or adding points.

Denoising Point Cloud Classification

Hierarchical Neural Network Architecture In Keyword Spotting

no code implementations6 Nov 2018 Yixiao Qu, Sihao Xue, Zhenyi Ying, Hang Zhou, Jue Sun

Keyword Spotting (KWS) provides the start signal of ASR problem, and thus it is essential to ensure a high recall rate.

Keyword Spotting speech-recognition +1

Talking Face Generation by Adversarially Disentangled Audio-Visual Representation

1 code implementation20 Jul 2018 Hang Zhou, Yu Liu, Ziwei Liu, Ping Luo, Xiaogang Wang

Talking face generation aims to synthesize a sequence of face images that correspond to a clip of speech.

Lip Reading Retrieval +2

Cannot find the paper you are looking for? You can Submit a new open access paper.