Search Results for author: Chenglong Wang

Found 44 papers, 17 papers with code

Prior Constraints-based Reward Model Training for Aligning Large Language Models

1 code implementation • 1 Apr 2024 • Hang Zhou, Chenglong Wang, Yimin Hu, Tong Xiao, Chunliang Zhang, Jingbo Zhu

Reinforcement learning with human feedback for aligning large language models (LLMs) trains a reward model typically using ranking loss with comparison pairs. However, the training procedure suffers from an inherent problem: the uncontrolled scaling of reward scores during reinforcement learning due to the lack of constraints while training the reward model. This paper proposes a Prior Constraints-based Reward Model (namely PCRM) training method to mitigate this problem.

reinforcement-learning
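
The ranking loss mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the standard pairwise form, -log(sigmoid(r_chosen - r_rejected)), which is common in RLHF reward modeling; it is not the PCRM method itself, and the function name is hypothetical. Because the loss depends only on the difference between the two scores, the absolute scale of the rewards is left unconstrained.

```python
import math


def pairwise_ranking_loss(chosen_reward: float, rejected_reward: float) -> float:
    """Standard pairwise ranking loss: -log(sigmoid(chosen - rejected)).

    Hypothetical helper for illustration only; not taken from the PCRM paper.
    """
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch on the sign for stability.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Shifting both scores by the same amount leaves the loss unchanged.
loss_small = pairwise_ranking_loss(1.0, 0.0)
loss_shifted = pairwise_ranking_loss(101.0, 100.0)

# Widening the margin always lowers the loss, so nothing bounds the scores.
loss_wide = pairwise_ranking_loss(10.0, 0.0)
```

Comparing `loss_small` with `loss_shifted` and `loss_wide` shows why a reward model trained this way has no built-in limit on how far apart, or how large, its scores can grow.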

Large Language Models are Parallel Multilingual Learners

1 code implementation • 14 Mar 2024 • Yongyu Mu, Peinan Feng, Zhiquan Cao, Yuzhang Wu, Bei Li, Chenglong Wang, Tong Xiao, Kai Song, Tongran Liu, Chunliang Zhang, Jingbo Zhu

In this study, we reveal an in-context learning (ICL) capability of multilingual large language models (LLMs): by translating the input to several languages, we provide Parallel Input in Multiple Languages (PiM) to LLMs, which significantly enhances their comprehension abilities.

In-Context Learning

ContrastDiagnosis: Enhancing Interpretability in Lung Nodule Diagnosis Using Contrastive Learning

no code implementations • 8 Mar 2024 • Chenglong Wang, Yinqiao Yi, Yida Wang, Chengxiu Zhang, Yun Liu, Kensaku MORI, Mei Yuan, Guang Yang

This framework is designed to introduce inherent transparency and provide extensive post-hoc explainability for deep learning models, making them more suitable for clinical medical diagnosis.

Contrastive Learning, Medical Diagnosis

How Does Selection Leak Privacy: Revisiting Private Selection and Improved Results for Hyper-parameter Tuning

no code implementations • 20 Feb 2024 • Zihang Xiang, Chenglong Wang, Di Wang

Recent works propose a generic private solution for the tuning process, yet a fundamental question still persists: is the current privacy bound for this solution tight?

What to Remember: Self-Adaptive Continual Learning for Audio Deepfake Detection

1 code implementation • 15 Dec 2023 • Xiaohui Zhang, Jiangyan Yi, Chenglong Wang, Chuyuan Zhang, Siding Zeng, JianHua Tao

The rapid evolution of speech synthesis and voice conversion has raised substantial concerns due to the potential misuse of such technology, prompting a pressing need for effective audio deepfake detection mechanisms.

Continual Learning, DeepFake Detection, +3

Data Formulator: AI-powered Concept-driven Visualization Authoring

no code implementations • 18 Sep 2023 • Chenglong Wang, John Thompson, Bongshin Lee

We realize this paradigm in Data Formulator, an interactive visualization authoring tool.

Learning Evaluation Models from Large Language Models for Sequence Generation

no code implementations • 8 Aug 2023 • Chenglong Wang, Hang Zhou, Kaiyan Chang, Tongran Liu, Chunliang Zhang, Quan Du, Tong Xiao, Jingbo Zhu

Large language models achieve state-of-the-art performance on sequence generation evaluation, but typically have a large number of parameters.

Machine Translation, Style Transfer, +1

Do You Remember? Overcoming Catastrophic Forgetting for Fake Audio Detection

1 code implementation • 7 Aug 2023 • Xiaohui Zhang, Jiangyan Yi, JianHua Tao, Chenglong Wang, Chuyuan Zhang

The orthogonal weight modification to overcome catastrophic forgetting does not consider the similarity of genuine audio across different datasets.

Continual Learning, Speech Emotion Recognition

ESRL: Efficient Sampling-based Reinforcement Learning for Sequence Generation

3 code implementations • 4 Aug 2023 • Chenglong Wang, Hang Zhou, Yimin Hu, Yifu Huo, Bei Li, Tongran Liu, Tong Xiao, Jingbo Zhu

Applying Reinforcement Learning (RL) to sequence generation models enables the direct optimization of long-term rewards (e.g., BLEU and human feedback), but typically requires large-scale sampling over a space of action sequences.

Abstractive Text Summarization, Language Modelling, +5

$\mathrm{SAM^{Med}}$: A medical image annotation framework based on large vision model

no code implementations • 11 Jul 2023 • Chenglong Wang, Dexuan Li, Sucheng Wang, Chengxiu Zhang, Yida Wang, Yun Liu, Guang Yang

The $\mathrm{SAM^{assist}}$ demonstrates the generalization ability of SAM to the downstream medical segmentation task using the prompt-learning approach.

Image Segmentation, Liver Segmentation, +3

Is Self-Repair a Silver Bullet for Code Generation?

1 code implementation • 16 Jun 2023 • Theo X. Olausson, Jeevana Priya Inala, Chenglong Wang, Jianfeng Gao, Armando Solar-Lezama

We hypothesize that this is because self-repair is bottlenecked by the model's ability to provide feedback on its own code; using a stronger model to artificially boost the quality of the feedback, we observe substantially larger performance gains.

Code Generation

Low-rank Adaptation Method for Wav2vec2-based Fake Audio Detection

no code implementations • 9 Jun 2023 • Chenglong Wang, Jiangyan Yi, Xiaohui Zhang, JianHua Tao, Le Xu, Ruibo Fu

Self-supervised speech models are a rapidly developing research topic in fake audio detection.

Learning From Yourself: A Self-Distillation Method for Fake Speech Detection

no code implementations • 2 Mar 2023 • Jun Xue, Cunhang Fan, Jiangyan Yi, Chenglong Wang, Zhengqi Wen, Dan Zhang, Zhao Lv

To address this problem, we propose using the deepest network to instruct the shallow networks, thereby enhancing them.

Improved Knowledge Distillation for Pre-trained Language Models via Knowledge Selection

no code implementations • 1 Feb 2023 • Chenglong Wang, Yi Lu, Yongyu Mu, Yimin Hu, Tong Xiao, Jingbo Zhu

Knowledge distillation addresses the problem of transferring knowledge from a teacher model to a student model.

Knowledge Distillation

CodeExp: Explanatory Code Document Generation

1 code implementation • 25 Nov 2022 • Haotian Cui, Chenglong Wang, JunJie Huang, Jeevana Priya Inala, Todd Mytkowicz, Bo Wang, Jianfeng Gao, Nan Duan

Our experiments show that (1) our refined training dataset lets models achieve better performance in the explanation generation tasks compared to larger unrefined data (15x larger), and (2) fine-tuned models can generate well-structured long docstrings comparable to human-written ones.

Explanation Generation, Text Generation

Execution-based Evaluation for Data Science Code Generation Models

1 code implementation • 17 Nov 2022 • JunJie Huang, Chenglong Wang, Jipeng Zhang, Cong Yan, Haotian Cui, Jeevana Priya Inala, Colin Clement, Nan Duan, Jianfeng Gao

Code generation models can benefit data scientists' productivity by automatically generating code from context and text descriptions.

Code Generation, Model Selection

System Fingerprint Recognition for Deepfake Audio: An Initial Dataset and Investigation

no code implementations • 21 Aug 2022 • Xinrui Yan, Jiangyan Yi, Chenglong Wang, JianHua Tao, Junzuo Zhou, Hao Gu, Ruibo Fu

The rapid progress of deep speech synthesis models has posed significant threats to society such as malicious content manipulation.

Face Swapping, Speech Synthesis

Fully Automated End-to-End Fake Audio Detection

no code implementations • 20 Aug 2022 • Chenglong Wang, Jiangyan Yi, JianHua Tao, Haiyang Sun, Xun Chen, Zhengkun Tian, Haoxin Ma, Cunhang Fan, Ruibo Fu

The existing fake audio detection systems often rely on expert experience to design the acoustic features or manually design the hyperparameters of the network structure.

Interactive Code Generation via Test-Driven User-Intent Formalization

no code implementations • 11 Aug 2022 • Shuvendu K. Lahiri, Sarah Fakhoury, Aaditya Naik, Georgios Sakkas, Saikat Chakraborty, Madanlal Musuvathi, Piali Choudhury, Curtis von Veh, Jeevana Priya Inala, Chenglong Wang, Jianfeng Gao

Large language models (LLMs) have shown great potential in automating significant aspects of coding by producing natural code from informal natural language (NL) intent.

Code Generation

Fault-Aware Neural Code Rankers

1 code implementation • 4 Jun 2022 • Jeevana Priya Inala, Chenglong Wang, Mei Yang, Andres Codas, Mark Encarnación, Shuvendu K Lahiri, Madanlal Musuvathi, Jianfeng Gao

Large language models (LLMs) have demonstrated an impressive ability to generate code for various programming tasks.

Code Generation

Learning Math Reasoning from Self-Sampled Correct and Partially-Correct Solutions

1 code implementation • 28 May 2022 • Ansong Ni, Jeevana Priya Inala, Chenglong Wang, Oleksandr Polozov, Christopher Meek, Dragomir Radev, Jianfeng Gao

We show that our use of self-sampled correct and partially-correct solutions can benefit learning and help guide the sampling process, leading to more efficient exploration of the solution space.

Arithmetic Reasoning, Efficient Exploration, +3

Towards Reliable and Explainable AI Model for Solid Pulmonary Nodule Diagnosis

no code implementations • 8 Apr 2022 • Chenglong Wang, Yun Liu, Fen Wang, Chengxiu Zhang, Yida Wang, Mei Yuan, Guang Yang

However, detection and accurate diagnosis of pulmonary nodules depend heavily on the experience of radiologists and can be a heavy workload for them.

The NiuTrans System for the WMT21 Efficiency Task

1 code implementation • 16 Sep 2021 • Chenglong Wang, Chi Hu, Yongyu Mu, Zhongxiang Yan, Siming Wu, Minyi Hu, Hang Cao, Bei Li, Ye Lin, Tong Xiao, Jingbo Zhu

This paper describes the NiuTrans system for the WMT21 translation efficiency task (http://statmt.org/wmt21/efficiency-task.html).

Knowledge Distillation, Translation

Falx: Synthesis-Powered Visualization Authoring

no code implementations • 1 Feb 2021 • Chenglong Wang, Yu Feng, Rastislav Bodik, Isil Dillig, Alvin Cheung, Amy J. Ko

Modern visualization tools aim to allow data analysts to easily create exploratory visualizations.

Human-Computer Interaction, Programming Languages

Organ Segmentation From Full-size CT Images Using Memory-Efficient FCN

no code implementations • 24 Mar 2020 • Chenglong Wang, Masahiro Oda, Kensaku MORI

In this paper, we present a memory-efficient FCN to tackle the challenge of high GPU memory demand in organ segmentation from clinical CT images.

Computed Tomography (CT), Image Segmentation, +4

Scout: Rapid Exploration of Interface Layout Alternatives through High-Level Design Constraints

no code implementations • 15 Jan 2020 • Amanda Swearngin, Chenglong Wang, Alannah Oleson, James Fogarty, Amy J. Ko

Although exploring alternatives is fundamental to creating better interface designs, current processes for creating alternatives are generally manual, limiting the alternatives a designer can explore.

Learning Transferable Graph Exploration

no code implementations • NeurIPS 2019 • Hanjun Dai, Yujia Li, Chenglong Wang, Rishabh Singh, Po-Sen Huang, Pushmeet Kohli

We propose a "learning to explore" framework where we learn a policy from a distribution of environments.

Efficient Exploration

Robust Text-to-SQL Generation with Execution-Guided Decoding

1 code implementation • 9 Jul 2018 • Chenglong Wang, Kedar Tatwawadi, Marc Brockschmidt, Po-Sen Huang, Yi Mao, Oleksandr Polozov, Rishabh Singh

We consider the problem of neural semantic parsing, which translates natural language questions into executable SQL queries.

Semantic Parsing, Text-To-SQL

Pointing Out SQL Queries From Text

no code implementations • ICLR 2018 • Chenglong Wang, Marc Brockschmidt, Rishabh Singh

We present a system that allows for querying data tables using natural language questions, where the system translates the question into an executable SQL query.
