Search Results for author: Jingbo Zhu

Found 78 papers, 27 papers with code

Paper
Add Code

The NiuTrans’s Submission to the IWSLT22 English-to-Chinese Offline Speech Translation Task

no code implementations • IWSLT (ACL) 2022 • Yuhao Zhang, Canan Huang, Chen Xu, Xiaoqian Liu, Bei Li, Anxiang Ma, Tong Xiao, Jingbo Zhu

This paper describes NiuTrans’s submission to the IWSLT22 English-to-Chinese (En-Zh) offline speech translation task.

Machine Translation Translation

Paper
Add Code

The NiuTrans System for the WMT20 Quality Estimation Shared Task

no code implementations • WMT (EMNLP) 2020 • Chi Hu, Hui Liu, Kai Feng, Chen Xu, Nuo Xu, Zefan Zhou, Shiqin Yan, Yingfeng Luo, Chenglong Wang, Xia Meng, Tong Xiao, Jingbo Zhu

This paper describes the submissions of the NiuTrans Team to the WMT 2020 Quality Estimation Shared Task.

Machine Translation Multi-Task Learning +2

Paper
Add Code

The NiuTrans Machine Translation Systems for WMT20

no code implementations • WMT (EMNLP) 2020 • Yuhao Zhang, Ziyang Wang, Runzhe Cao, Binghao Wei, Weiqiao Shan, Shuhan Zhou, Abudurexiti Reheman, Tao Zhou, Xin Zeng, Laohu Wang, Yongyu Mu, Jingnan Zhang, Xiaoqian Liu, Xuanjun Zhou, Yinqiao Li, Bei Li, Tong Xiao, Jingbo Zhu

This paper describes NiuTrans neural machine translation systems of the WMT20 news translation tasks.

Knowledge Distillation Machine Translation +1

Paper
Add Code

利用图像描述与知识图谱增强表示的视觉问答(Exploiting Image Captions and External Knowledge as Representation Enhancement for Visual Question Answering)

no code implementations • CCL 2021 • Gechao Wang, Muhua Zhu, Chen Xu, Yan Zhang, Huizhen Wang, Jingbo Zhu

Image Captioning Question Answering +1

Paper
Add Code

The NiuTrans System for the WMT 2021 Efficiency Task

no code implementations • WMT (EMNLP) 2021 • Chenglong Wang, Chi Hu, Yongyu Mu, Zhongxiang Yan, Siming Wu, Yimin Hu, Hang Cao, Bei Li, Ye Lin, Tong Xiao, Jingbo Zhu

This paper describes the NiuTrans system for the WMT21 translation efficiency task.

Knowledge Distillation Translation

Paper
Add Code

Prior Constraints-based Reward Model Training for Aligning Large Language Models

1 code implementation • 1 Apr 2024 • Hang Zhou, Chenglong Wang, Yimin Hu, Tong Xiao, Chunliang Zhang, Jingbo Zhu

Reinforcement learning with human feedback for aligning large language models (LLMs) trains a reward model typically using ranking loss with comparison pairs. However, the training procedure suffers from an inherent problem: the uncontrolled scaling of reward scores during reinforcement learning due to the lack of constraints while training the reward model. This paper proposes a Prior Constraints-based Reward Model (namely PCRM) training method to mitigate this problem.

reinforcement-learning

Paper
Code

Efficient Prompting Methods for Large Language Models: A Survey

no code implementations • 1 Apr 2024 • Kaiyan Chang, Songcheng Xu, Chenglong Wang, Yingfeng Luo, Tong Xiao, Jingbo Zhu

In this paper, we present a comprehensive overview of these methods.

In-Context Learning

Paper
Add Code

RankPrompt: Step-by-Step Comparisons Make Language Models Better Reasoners

no code implementations • 19 Mar 2024 • Chi Hu, Yuan Ge, Xiangnan Ma, Hang Cao, Qiang Li, Yonghua Yang, Tong Xiao, Jingbo Zhu

Our experiments across 11 arithmetic and commonsense reasoning tasks show that RankPrompt significantly enhances the reasoning performance of ChatGPT and GPT-4, with improvements of up to 13%.

Paper
Add Code

Large Language Models are Parallel Multilingual Learners

1 code implementation • 14 Mar 2024 • Yongyu Mu, Peinan Feng, Zhiquan Cao, Yuzhang Wu, Bei Li, Chenglong Wang, Tong Xiao, Kai Song, Tongran Liu, Chunliang Zhang, Jingbo Zhu

In this study, we reveal an in-context learning (ICL) capability of multilingual large language models (LLMs): by translating the input to several languages, we provide Parallel Input in Multiple Languages (PiM) to LLMs, which significantly enhances their comprehension abilities.

In-Context Learning

Paper
Code

Soft Alignment of Modality Space for End-to-end Speech Translation

no code implementations • 18 Dec 2023 • Yuhao Zhang, Kaiqi Kou, Bei Li, Chen Xu, Chunliang Zhang, Tong Xiao, Jingbo Zhu

End-to-end Speech Translation (ST) aims to convert speech into target text within a unified model.

Cross-Lingual Transfer Translation

Paper
Add Code

Introduction to Transformers: an NLP Perspective

1 code implementation • 29 Nov 2023 • Tong Xiao, Jingbo Zhu

Transformers have dominated empirical machine learning models of natural language processing.

Paper
Code

Rethinking and Improving Multi-task Learning for End-to-end Speech Translation

1 code implementation • 7 Nov 2023 • Yuhao Zhang, Chen Xu, Bei Li, Hao Chen, Tong Xiao, Chunliang Zhang, Jingbo Zhu

Significant improvements in end-to-end speech translation (ST) have been achieved through the application of multi-task learning.

Multi-Task Learning

Paper
Code

Incorporating Probing Signals into Multimodal Machine Translation via Visual Question-Answering Pairs

1 code implementation • 26 Oct 2023 • Yuxin Zuo, Bei Li, Chuanhao Lv, Tong Zheng, Tong Xiao, Jingbo Zhu

This paper presents an in-depth study of multimodal machine translation (MMT), examining the prevailing understanding that MMT systems exhibit decreased sensitivity to visual information when text inputs are complete.

Attribute Multimodal Machine Translation +2

Paper
Code

PartialFormer: Modeling Part Instead of Whole

1 code implementation • 23 Oct 2023 • Tong Zheng, Bei Li, Huiwen Bao, Weiqiao Shan, Tong Xiao, Jingbo Zhu

The design choices in Transformer feed-forward neural networks have resulted in significant computational and parameter overhead.

Ranked #23 on Machine Translation on WMT2014 English-German

Abstractive Text Summarization Machine Translation +1

Paper
Code

Bridging the Gaps of Both Modality and Language: Synchronous Bilingual CTC for Speech Translation and Speech Recognition

1 code implementation • 21 Sep 2023 • Chen Xu, Xiaoqian Liu, Erfeng He, Yuhao Zhang, Qianqian Dong, Tong Xiao, Jingbo Zhu, Dapeng Man, Wu Yang

In this study, we present synchronous bilingual Connectionist Temporal Classification (CTC), an innovative framework that leverages dual CTC to bridge the gaps of both modality and language in the speech translation (ST) task.

speech-recognition Speech Recognition +1

Paper
Code

Learning Evaluation Models from Large Language Models for Sequence Generation

no code implementations • 8 Aug 2023 • Chenglong Wang, Hang Zhou, Kaiyan Chang, Tongran Liu, Chunliang Zhang, Quan Du, Tong Xiao, Jingbo Zhu

Large language models achieve state-of-the-art performance on sequence generation evaluation, but typically have a large number of parameters.

Machine Translation Style Transfer +1

Paper
Add Code

ESRL: Efficient Sampling-based Reinforcement Learning for Sequence Generation

3 code implementations • 4 Aug 2023 • Chenglong Wang, Hang Zhou, Yimin Hu, Yifu Huo, Bei Li, Tongran Liu, Tong Xiao, Jingbo Zhu

Applying Reinforcement Learning (RL) to sequence generation models enables the direct optimization of long-term rewards (\textit{e. g.,} BLEU and human feedback), but typically requires large-scale sampling over a space of action sequences.

Abstractive Text Summarization Language Modelling +5

19,592

Paper
Code

Towards Robust Aspect-based Sentiment Analysis through Non-counterfactual Augmentations

no code implementations • 24 Jun 2023 • Xinyu Liu, Yan Ding, Kaikai An, Chunyang Xiao, Pranava Madhyastha, Tong Xiao, Jingbo Zhu

While state-of-the-art NLP models have demonstrated excellent performance for aspect based sentiment analysis (ABSA), substantial evidence has been presented on their lack of robustness.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +2

Paper
Add Code

Recent Advances in Direct Speech-to-text Translation

no code implementations • 20 Jun 2023 • Chen Xu, Rong Ye, Qianqian Dong, Chengqi Zhao, Tom Ko, Mingxuan Wang, Tong Xiao, Jingbo Zhu

Recently, speech-to-text translation has attracted more and more attention and many studies have emerged rapidly.

Data Augmentation Knowledge Distillation +2

Paper
Add Code

Understanding Parameter Sharing in Transformers

no code implementations • 15 Jun 2023 • Ye Lin, Mingxuan Wang, Zhexi Zhang, Xiaohui Wang, Tong Xiao, Jingbo Zhu

Inspired by this, we tune the training hyperparameters related to model convergence in a targeted manner.

Machine Translation

Paper
Add Code

Modality Adaption or Regularization? A Case Study on End-to-End Speech Translation

1 code implementation • 13 Jun 2023 • Yuchen Han, Chen Xu, Tong Xiao, Jingbo Zhu

Pre-training and fine-tuning is a paradigm for alleviating the data scarcity problem in end-to-end speech translation (E2E ST).

Paper
Code

MobileNMT: Enabling Translation in 15MB and 30ms

1 code implementation • 7 Jun 2023 • Ye Lin, Xiaohui Wang, Zhexi Zhang, Mingxuan Wang, Tong Xiao, Jingbo Zhu

With the co-design of model and engine, compared with the existing system, we speed up 47. 0x and save 99. 5% of memory with only 11. 6% loss of BLEU.

Model Compression NMT +2

Paper
Code

Deliberate then Generate: Enhanced Prompting Framework for Text Generation

no code implementations • 31 May 2023 • Bei Li, Rui Wang, Junliang Guo, Kaitao Song, Xu Tan, Hany Hassan, Arul Menezes, Tong Xiao, Jiang Bian, Jingbo Zhu

Large language models (LLMs) have shown remarkable success across a wide range of natural language generation tasks, where proper prompt designs make great impacts.

Text Generation

Paper
Add Code

Augmenting Large Language Model Translators via Translation Memories

no code implementations • 27 May 2023 • Yongyu Mu, Abudurexiti Reheman, Zhiquan Cao, Yuchun Fan, Bei Li, Yinqiao Li, Tong Xiao, Chunliang Zhang, Jingbo Zhu

Using translation memories (TMs) as prompts is a promising approach to in-context learning of machine translation models.

In-Context Learning Language Modelling +4

Paper
Add Code

CTC-based Non-autoregressive Speech Translation

1 code implementation • 27 May 2023 • Chen Xu, Xiaoqian Liu, Xiaowen Liu, Qingxuan Sun, Yuhao Zhang, Murun Yang, Qianqian Dong, Tom Ko, Mingxuan Wang, Tong Xiao, Anxiang Ma, Jingbo Zhu

Combining end-to-end speech translation (ST) and non-autoregressive (NAR) generation is promising in language and speech processing for their advantages of less error propagation and low latency.

Translation

Paper
Code

Bridging the Granularity Gap for Acoustic Modeling

1 code implementation • 27 May 2023 • Chen Xu, Yuhao Zhang, Chengbo Jiao, Xiaoqian Liu, Chi Hu, Xin Zeng, Tong Xiao, Anxiang Ma, Huizhen Wang, Jingbo Zhu

While Transformer has become the de-facto standard for speech, modeling upon the fine-grained frame-level features remains an open challenge of capturing long-distance dependencies and distributing the attention weights.

speech-recognition Speech Recognition

Paper
Code

TranSFormer: Slow-Fast Transformer for Machine Translation

no code implementations • 26 May 2023 • Bei Li, Yi Jing, Xu Tan, Zhen Xing, Tong Xiao, Jingbo Zhu

Learning multiscale Transformer models has been evidenced as a viable approach to augmenting machine translation systems.

Machine Translation Translation

Paper
Add Code

Multi-Path Transformer is Better: A Case Study on Neural Machine Translation

no code implementations • 10 May 2023 • Ye Lin, Shuhan Zhou, Yanyang Li, Anxiang Ma, Tong Xiao, Jingbo Zhu

For years the model performance in machine learning obeyed a power-law relationship with the model size.

Machine Translation

Paper
Add Code

Improved Knowledge Distillation for Pre-trained Language Models via Knowledge Selection

no code implementations • 1 Feb 2023 • Chenglong Wang, Yi Lu, Yongyu Mu, Yimin Hu, Tong Xiao, Jingbo Zhu

Knowledge distillation addresses the problem of transferring knowledge from a teacher model to a student model.

Knowledge Distillation

Paper
Add Code

Prompting Neural Machine Translation with Translation Memories

no code implementations • 13 Jan 2023 • Abudurexiti Reheman, Tao Zhou, Yingfeng Luo, Di Yang, Tong Xiao, Jingbo Zhu

Improving machine translation (MT) systems with translation memories (TMs) is of great interest to practitioners in the MT community.

Machine Translation NMT +1

Paper
Add Code

EIT: Enhanced Interactive Transformer

1 code implementation • 20 Dec 2022 • Tong Zheng, Bei Li, Huiwen Bao, Tong Xiao, Jingbo Zhu

In this paper, we propose a novel architecture, the Enhanced Interactive Transformer (EIT), to address the issue of head degradation in self-attention mechanisms.

Abstractive Text Summarization Language Modelling +2

Paper
Code

Improving End-to-end Speech Translation by Leveraging Auxiliary Speech and Text Data

no code implementations • 4 Dec 2022 • Yuhao Zhang, Chen Xu, Bojie Hu, Chunliang Zhang, Tong Xiao, Jingbo Zhu

We present a method for introducing a text encoder into pre-trained end-to-end speech translation systems.

Denoising Translation

Paper
Add Code

Learning Multiscale Transformer Models for Sequence Generation

1 code implementation • 19 Jun 2022 • Bei Li, Tong Zheng, Yi Jing, Chengbo Jiao, Tong Xiao, Jingbo Zhu

In this work, we define those scales in different linguistic units, including sub-words, words and phrases.

Paper
Code

On Vision Features in Multimodal Machine Translation

2 code implementations • ACL 2022 • Bei Li, Chuanhao Lv, Zefan Zhou, Tao Zhou, Tong Xiao, Anxiang Ma, Jingbo Zhu

Previous work on multimodal machine translation (MMT) has focused on the way of incorporating vision features into translation but little attention is on the quality of vision models.

Image Captioning Multimodal Machine Translation +3

Paper
Code

ODE Transformer: An Ordinary Differential Equation-Inspired Model for Sequence Generation

1 code implementation • ACL 2022 • Bei Li, Quan Du, Tao Zhou, Yi Jing, Shuhan Zhou, Xin Zeng, Tong Xiao, Jingbo Zhu, Xuebo Liu, Min Zhang

Inspired by this, we design a new architecture, {\it ODE Transformer}, which is analogous to the Runge-Kutta method that is well motivated in ODE.

Abstractive Text Summarization Machine Translation +1

Paper
Code

The NiuTrans Machine Translation Systems for WMT21

no code implementations • WMT (EMNLP) 2021 • Shuhan Zhou, Tao Zhou, Binghao Wei, Yingfeng Luo, Yongyu Mu, Zefan Zhou, Chenglong Wang, Xuanjun Zhou, Chuanhao Lv, Yi Jing, Laohu Wang, Jingnan Zhang, Canan Huang, Zhongxiang Yan, Chi Hu, Bei Li, Tong Xiao, Jingbo Zhu

This paper describes NiuTrans neural machine translation systems of the WMT 2021 news translation tasks.

Knowledge Distillation Machine Translation +1

Paper
Add Code

The NiuTrans System for the WMT21 Efficiency Task

1 code implementation • 16 Sep 2021 • Chenglong Wang, Chi Hu, Yongyu Mu, Zhongxiang Yan, Siming Wu, Minyi Hu, Hang Cao, Bei Li, Ye Lin, Tong Xiao, Jingbo Zhu

This paper describes the NiuTrans system for the WMT21 translation efficiency task (http://statmt. org/wmt21/efficiency-task. html).

Knowledge Distillation Translation

130

Paper
Code

The NiuTrans System for WNGT 2020 Efficiency Task

2 code implementations • WS 2020 • Chi Hu, Bei Li, Ye Lin, Yinqiao Li, Yanyang Li, Chenglong Wang, Tong Xiao, Jingbo Zhu

This paper describes the submissions of the NiuTrans Team to the WNGT 2020 Efficiency Shared Task.

Knowledge Distillation Machine Translation +2

377

Paper
Code

RankNAS: Efficient Neural Architecture Search by Pairwise Ranking

no code implementations • EMNLP 2021 • Chi Hu, Chenglong Wang, Xiangnan Ma, Xia Meng, Yinqiao Li, Tong Xiao, Jingbo Zhu, Changliang Li

This paper addresses the efficiency challenge of Neural Architecture Search (NAS) by formulating the task as a ranking problem.

Language Modelling Machine Translation +2

Paper
Add Code

Bag of Tricks for Optimizing Transformer Efficiency

1 code implementation • Findings (EMNLP) 2021 • Ye Lin, Yanyang Li, Tong Xiao, Jingbo Zhu

Improving Transformer efficiency has become increasingly attractive recently.

Quantization Translation

Paper
Code

The NiuTrans End-to-End Speech Translation System for IWSLT 2021 Offline Task

no code implementations • ACL (IWSLT) 2021 • Chen Xu, Xiaoqian Liu, Xiaowen Liu, Laohu Wang, Canan Huang, Tong Xiao, Jingbo Zhu

This paper describes the submission of the NiuTrans end-to-end speech translation system for the IWSLT 2021 offline task, which translates from the English audio to German text directly without intermediate transcription.

Position Translation

Paper
Add Code

Stacked Acoustic-and-Textual Encoding: Integrating the Pre-trained Models into Speech Translation Encoders

no code implementations • ACL 2021 • Chen Xu, Bojie Hu, Yanyang Li, Yuhao Zhang, Shen Huang, Qi Ju, Tong Xiao, Jingbo Zhu

To our knowledge, we are the first to develop an end-to-end ST system that achieves comparable or even better BLEU performance than the cascaded ST counterpart when large-scale ASR and MT data is available.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

Paper
Add Code

ODE Transformer: An Ordinary Differential Equation-Inspired Model for Neural Machine Translation

no code implementations • 6 Apr 2021 • Bei Li, Quan Du, Tao Zhou, Shuhan Zhou, Xin Zeng, Tong Xiao, Jingbo Zhu

We show that a residual block of layers in Transformer can be described as a higher-order solution to ODEs.

Machine Translation Translation

Paper
Add Code

An Efficient Transformer Decoder with Compressed Sub-layers

no code implementations • 3 Jan 2021 • Yanyang Li, Ye Lin, Tong Xiao, Jingbo Zhu

The large attention-based encoder-decoder network (Transformer) has become prevailing recently due to its effectiveness.

Machine Translation Translation

Paper
Add Code

Learning Light-Weight Translation Models from Deep Transformer

1 code implementation • 27 Dec 2020 • Bei Li, Ziyang Wang, Hui Liu, Quan Du, Tong Xiao, Chunliang Zhang, Jingbo Zhu

We proposed a novel group-permutation based knowledge distillation approach to compressing the deep Transformer model into a shallow model.

Knowledge Distillation Machine Translation +2

Paper
Code

A Simple and Effective Approach to Robust Unsupervised Bilingual Dictionary Induction

no code implementations • COLING 2020 • Yanyang Li, Yingfeng Luo, Ye Lin, Quan Du, Huizhen Wang, ShuJian Huang, Tong Xiao, Jingbo Zhu

Our experiments show that this simple method does not hamper the performance of similar language pairs and achieves an accuracy of 13. 64~55. 53% between English and four distant languages, i. e., Chinese, Japanese, Vietnamese and Thai.

Dimensionality Reduction Self-Learning

Paper
Add Code

Dynamic Curriculum Learning for Low-Resource Neural Machine Translation

no code implementations • COLING 2020 • Chen Xu, Bojie Hu, Yufan Jiang, Kai Feng, Zeyang Wang, Shen Huang, Qi Ju, Tong Xiao, Jingbo Zhu

This eases training by highlighting easy samples that the current model has enough competence to learn.

Low-Resource Neural Machine Translation NMT +1

Paper
Add Code

Layer-Wise Multi-View Learning for Neural Machine Translation

no code implementations • COLING 2020 • Qiang Wang, Changliang Li, Yue Zhang, Tong Xiao, Jingbo Zhu

In this way, in addition to the topmost encoder layer (referred to as the primary view), we also incorporate an intermediate encoder layer as the auxiliary view.

Machine Translation MULTI-VIEW LEARNING +2

Paper
Add Code

Training Flexible Depth Model by Multi-Task Learning for Neural Machine Translation

no code implementations • Findings of the Association for Computational Linguistics 2020 • Qiang Wang, Tong Xiao, Jingbo Zhu

The standard neural machine translation model can only decode with the same depth configuration as training.

Machine Translation Multi-Task Learning +1

Paper
Add Code

Shallow-to-Deep Training for Neural Machine Translation

1 code implementation • EMNLP 2020 • Bei Li, Ziyang Wang, Hui Liu, Yufan Jiang, Quan Du, Tong Xiao, Huizhen Wang, Jingbo Zhu

We find that stacking layers is helpful in improving the representation ability of NMT models and adjacent layers perform similarly.

Machine Translation NMT +2

Paper
Code

Weight Distillation: Transferring the Knowledge in Neural Network Parameters

no code implementations • ACL 2021 • Ye Lin, Yanyang Li, Ziyang Wang, Bei Li, Quan Du, Tong Xiao, Jingbo Zhu

Inspired by this, we investigate methods of model acceleration and compression in another line of research.

Knowledge Distillation Machine Translation +1

Paper
Add Code

Towards Fully 8-bit Integer Inference for the Transformer Model

no code implementations • 17 Sep 2020 • Ye Lin, Yanyang Li, Tengbo Liu, Tong Xiao, Tongran Liu, Jingbo Zhu

8-bit integer inference, as a promising direction in reducing both the latency and storage of deep neural networks, has made great progress recently.

Language Modelling Quantization +1

Paper
Add Code

Does Multi-Encoder Help? A Case Study on Context-Aware Neural Machine Translation

1 code implementation • ACL 2020 • Bei Li, Hui Liu, Ziyang Wang, Yufan Jiang, Tong Xiao, Jingbo Zhu, Tongran Liu, Changliang Li

In encoder-decoder neural models, multiple encoders are in general used to represent the contextual information in addition to the individual sentence.

Machine Translation NMT +2

Paper
Code

Learning Architectures from an Extended Search Space for Language Modeling

no code implementations • ACL 2020 • Yinqiao Li, Chi Hu, Yuhao Zhang, Nuo Xu, Yufan Jiang, Tong Xiao, Jingbo Zhu, Tongran Liu, Changliang Li

Neural architecture search (NAS) has advanced significantly in recent years but most NAS systems restrict search to learning architectures of a recurrent or convolutional cell.

Chunking Language Modelling +4

Paper
Add Code

Multi-layer Representation Fusion for Neural Machine Translation

1 code implementation • COLING 2018 • Qiang Wang, Fuxue Li, Tong Xiao, Yanyang Li, Yinqiao Li, Jingbo Zhu

In this paper, we propose a multi-layer representation fusion (MLRF) approach to fusing stacked layers.

Machine Translation Sentence +1

Paper
Code

Neural Machine Translation with Joint Representation

1 code implementation • 16 Feb 2020 • Yanyang Li, Qiang Wang, Tong Xiao, Tongran Liu, Jingbo Zhu

Though early successes of Statistical Machine Translation (SMT) systems are attributed in part to the explicit modelling of the interaction between any two source and target units, e. g., alignment, the recent Neural Machine Translation (NMT) systems resort to the attention which partially encodes the interaction for efficiency.

Machine Translation NMT +1

Paper
Code

Improved Differentiable Architecture Search for Language Modeling and Named Entity Recognition

1 code implementation • IJCNLP 2019 • Yufan Jiang, Chi Hu, Tong Xiao, Chunliang Zhang, Jingbo Zhu

In this paper, we study differentiable neural architecture search (NAS) methods for natural language processing.

Ranked #1 on Language Modelling on PTB Diagnostic ECG Database

Language Modelling named-entity-recognition +3

Paper
Code

The NiuTrans Machine Translation Systems for WMT19

no code implementations • WS 2019 • Bei Li, Yinqiao Li, Chen Xu, Ye Lin, Jiqiang Liu, Hui Liu, Ziyang Wang, Yuhao Zhang, Nuo Xu, Zeyang Wang, Kai Feng, Hexuan Chen, Tengbo Liu, Yanyang Li, Qiang Wang, Tong Xiao, Jingbo Zhu

We participated in 13 translation directions, including 11 supervised tasks, namely EN↔{ZH, DE, RU, KK, LT}, GU→EN and the unsupervised DE↔CS sub-track.

Knowledge Distillation Machine Translation +2

Paper
Add Code

Sharing Attention Weights for Fast Transformer

no code implementations • 26 Jun 2019 • Tong Xiao, Yinqiao Li, Jingbo Zhu, Zhengtao Yu, Tongran Liu

This is even 16 times faster than the baseline with no use of the attention cache.

Machine Translation Translation

Paper
Add Code

Shared-Private Bilingual Word Embeddings for Neural Machine Translation

no code implementations • ACL 2019 • Xuebo Liu, Derek F. Wong, Yang Liu, Lidia S. Chao, Tong Xiao, Jingbo Zhu

For similar source and target words, their embeddings tend to share a part of the features and they cooperatively learn these common representation units.

Machine Translation NMT +3

Paper
Add Code

Learning Deep Transformer Models for Machine Translation

2 code implementations • ACL 2019 • Qiang Wang, Bei Li, Tong Xiao, Jingbo Zhu, Changliang Li, Derek F. Wong, Lidia S. Chao

Transformer is the state-of-the-art model in recent machine translation evaluations.

Machine Translation Translation

113

Paper
Code

The NiuTrans Machine Translation System for WMT18

no code implementations • WS 2018 • Qiang Wang, Bei Li, Jiqiang Liu, Bojian Jiang, Zheyang Zhang, Yinqiao Li, Ye Lin, Tong Xiao, Jingbo Zhu

This paper describes the submission of the NiuTrans neural machine translation system for the WMT 2018 Chinese ↔ English news translation tasks.

Machine Translation Translation

Paper
Add Code

A Simple and Effective Approach to Coverage-Aware Neural Machine Translation

no code implementations • ACL 2018 • Yanyang Li, Tong Xiao, Yinqiao Li, Qiang Wang, Changming Xu, Jingbo Zhu

We offer a simple and effective method to seek a better balance between model confidence and length preference for Neural Machine Translation (NMT).

Machine Translation NMT +1

Paper
Add Code

Towards Bidirectional Hierarchical Representations for Attention-Based Neural Machine Translation

no code implementations • EMNLP 2017 • Baosong Yang, Derek F. Wong, Tong Xiao, Lidia S. Chao, Jingbo Zhu

This paper proposes a hierarchical attentional neural translation model which focuses on enhancing source-side hierarchical representations by covering both local and global semantic information using a bidirectional tree-based encoder.

Machine Translation Translation