no code implementations • IWSLT (ACL) 2022 • Yuhao Zhang, Canan Huang, Chen Xu, Xiaoqian Liu, Bei Li, Anxiang Ma, Tong Xiao, Jingbo Zhu
This paper describes NiuTrans’s submission to the IWSLT22 English-to-Chinese (En-Zh) offline speech translation task.
no code implementations • WMT (EMNLP) 2021 • Chenglong Wang, Chi Hu, Yongyu Mu, Zhongxiang Yan, Siming Wu, Yimin Hu, Hang Cao, Bei Li, Ye Lin, Tong Xiao, Jingbo Zhu
This paper describes the NiuTrans system for the WMT21 translation efficiency task.
no code implementations • WMT (EMNLP) 2020 • Yuhao Zhang, Ziyang Wang, Runzhe Cao, Binghao Wei, Weiqiao Shan, Shuhan Zhou, Abudurexiti Reheman, Tao Zhou, Xin Zeng, Laohu Wang, Yongyu Mu, Jingnan Zhang, Xiaoqian Liu, Xuanjun Zhou, Yinqiao Li, Bei Li, Tong Xiao, Jingbo Zhu
This paper describes the NiuTrans neural machine translation systems for the WMT20 news translation tasks.
1 code implementation • 14 Mar 2024 • Yongyu Mu, Peinan Feng, Zhiquan Cao, Yuzhang Wu, Bei Li, Chenglong Wang, Tong Xiao, Kai Song, Tongran Liu, Chunliang Zhang, Jingbo Zhu
In this study, we reveal an in-context learning (ICL) capability of multilingual large language models (LLMs): by translating the input to several languages, we provide Parallel Input in Multiple Languages (PiM) to LLMs, which significantly enhances their comprehension abilities.
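A minimal sketch of how a PiM-style prompt could be assembled; the `translate` helper is a placeholder stub introduced here for illustration, not part of the paper's released code.

```python
# Sketch of Parallel Input in Multiple Languages (PiM) prompting.
# `translate` is a hypothetical stub; in practice it would call any MT system.
def translate(text: str, target_lang: str) -> str:
    # Placeholder translation: replace with a real machine translation call.
    return f"[{target_lang} translation of: {text}]"

def build_pim_prompt(source: str, pivot_langs=("German", "French", "Chinese")) -> str:
    """Concatenate the original input with its translations into several languages."""
    parallel_inputs = [source] + [translate(source, lang) for lang in pivot_langs]
    numbered = "\n".join(f"Input {i + 1}: {text}" for i, text in enumerate(parallel_inputs))
    return numbered + "\nAnswer based on all of the parallel inputs above."

print(build_pim_prompt("Which city hosted the 2008 Summer Olympics?"))
```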
no code implementations • 18 Dec 2023 • Yuhao Zhang, Kaiqi Kou, Bei Li, Chen Xu, Chunliang Zhang, Tong Xiao, Jingbo Zhu
End-to-end Speech Translation (ST) aims to convert speech into target text within a unified model.
1 code implementation • 7 Nov 2023 • Yuhao Zhang, Chen Xu, Bei Li, Hao Chen, Tong Xiao, Chunliang Zhang, Jingbo Zhu
Significant improvements in end-to-end speech translation (ST) have been achieved through the application of multi-task learning.
1 code implementation • 26 Oct 2023 • Yuxin Zuo, Bei Li, Chuanhao Lv, Tong Zheng, Tong Xiao, Jingbo Zhu
This paper presents an in-depth study of multimodal machine translation (MMT), examining the prevailing understanding that MMT systems exhibit decreased sensitivity to visual information when text inputs are complete.
1 code implementation • 23 Oct 2023 • Tong Zheng, Bei Li, Huiwen Bao, Weiqiao Shan, Tong Xiao, Jingbo Zhu
The design choices in Transformer feed-forward neural networks have resulted in significant computational and parameter overhead.
no code implementations • 22 Oct 2023 • Kun Wei, Bei Li, Hang Lv, Quan Lu, Ning Jiang, Lei Xie
By introducing both cross-modal and conversational representations into the decoder, our model retains context over longer sentences without information loss, achieving relative accuracy improvements of 8.8% and 23% on Mandarin conversation datasets HKUST and MagicData-RAMC, respectively, compared to the standard Conformer model.
1 code implementation • 15 Sep 2023 • Qingyan Guo, Rui Wang, Junliang Guo, Bei Li, Kaitao Song, Xu Tan, Guoqing Liu, Jiang Bian, Yujiu Yang
Large Language Models (LLMs) excel in various tasks, but they rely on carefully crafted prompts that often demand substantial human effort.
3 code implementations • 4 Aug 2023 • Chenglong Wang, Hang Zhou, Yimin Hu, Yifu Huo, Bei Li, Tongran Liu, Tong Xiao, Jingbo Zhu
Applying Reinforcement Learning (RL) to sequence generation models enables the direct optimization of long-term rewards (e.g., BLEU and human feedback), but typically requires large-scale sampling over a space of action sequences.
1 code implementation • 31 May 2023 • Xiao Xu, Bei Li, Chenfei Wu, Shao-Yen Tseng, Anahita Bhiwandiwalla, Shachar Rosenman, Vasudev Lal, Wanxiang Che, Nan Duan
With only 4M VLP data, ManagerTower achieves superior performance on various downstream VL tasks, especially 79.15% accuracy on VQAv2 Test-Std, 86.56% IR@1 and 95.64% TR@1 on Flickr30K.
no code implementations • 31 May 2023 • Bei Li, Rui Wang, Junliang Guo, Kaitao Song, Xu Tan, Hany Hassan, Arul Menezes, Tong Xiao, Jiang Bian, Jingbo Zhu
Large language models (LLMs) have shown remarkable success across a wide range of natural language generation tasks, where proper prompt design makes a great impact.
no code implementations • 27 May 2023 • Yongyu Mu, Abudurexiti Reheman, Zhiquan Cao, Yuchun Fan, Bei Li, Yinqiao Li, Tong Xiao, Chunliang Zhang, Jingbo Zhu
Using translation memories (TMs) as prompts is a promising approach to in-context learning of machine translation models.
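A small sketch of the general idea of TM-based prompting: retrieve translation-memory pairs similar to the input and place them in a few-shot prompt. The simple fuzzy-match retrieval below is an assumption for illustration, not the paper's exact method.

```python
# Translation-memory prompting for in-context machine translation (illustrative).
from difflib import SequenceMatcher

def retrieve_tm(source: str, memory: list[tuple[str, str]], k: int = 2):
    """Return the k TM pairs whose source side is most similar to the input."""
    return sorted(memory,
                  key=lambda pair: SequenceMatcher(None, source, pair[0]).ratio(),
                  reverse=True)[:k]

def build_prompt(source: str, memory: list[tuple[str, str]]) -> str:
    shots = "\n".join(f"English: {s}\nChinese: {t}" for s, t in retrieve_tm(source, memory))
    return f"{shots}\nEnglish: {source}\nChinese:"

memory = [("The weather is nice today.", "今天天气很好。"),
          ("How much does this cost?", "这个多少钱？")]
print(build_prompt("The weather is cold today.", memory))
```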
no code implementations • 26 May 2023 • Bei Li, Yi Jing, Xu Tan, Zhen Xing, Tong Xiao, Jingbo Zhu
Learning multiscale Transformer models has been shown to be a viable approach to augmenting machine translation systems.
1 code implementation • 20 Dec 2022 • Tong Zheng, Bei Li, Huiwen Bao, Tong Xiao, Jingbo Zhu
In this paper, we propose a novel architecture, the Enhanced Interactive Transformer (EIT), to address the issue of head degradation in self-attention mechanisms.
1 code implementation • 19 Jun 2022 • Bei Li, Tong Zheng, Yi Jing, Chengbo Jiao, Tong Xiao, Jingbo Zhu
In this work, we define those scales in different linguistic units, including sub-words, words and phrases.
2 code implementations • ACL 2022 • Bei Li, Chuanhao Lv, Zefan Zhou, Tao Zhou, Tong Xiao, Anxiang Ma, Jingbo Zhu
Previous work on multimodal machine translation (MMT) has focused on how to incorporate vision features into translation, but little attention has been paid to the quality of the vision models themselves.
1 code implementation • ACL 2022 • Bei Li, Quan Du, Tao Zhou, Yi Jing, Shuhan Zhou, Xin Zeng, Tong Xiao, Jingbo Zhu, Xuebo Liu, Min Zhang
Inspired by this, we design a new architecture, ODE Transformer, which is analogous to the Runge-Kutta method, a well-motivated numerical scheme for solving ODEs.
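A minimal PyTorch sketch of the second-order Runge-Kutta (Heun) residual update that this analogy suggests; it illustrates the idea only, and the exact coefficients and sub-layer composition used in ODE Transformer may differ.

```python
import torch
import torch.nn as nn

class RK2Block(nn.Module):
    """Heun-style (second-order Runge-Kutta) residual update around a sub-layer F,
    where F could be any Transformer sub-layer (attention or feed-forward)."""
    def __init__(self, sublayer: nn.Module):
        super().__init__()
        self.F = sublayer

    def forward(self, y):
        k1 = self.F(y)               # first slope estimate: F(y_n)
        k2 = self.F(y + k1)          # second slope estimate at the Euler predictor
        return y + 0.5 * (k1 + k2)   # y_{n+1} = y_n + (k1 + k2) / 2

# Toy usage with a linear sub-layer:
block = RK2Block(nn.Linear(8, 8))
x = torch.randn(2, 8)
print(block(x).shape)
```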
no code implementations • WMT (EMNLP) 2021 • Shuhan Zhou, Tao Zhou, Binghao Wei, Yingfeng Luo, Yongyu Mu, Zefan Zhou, Chenglong Wang, Xuanjun Zhou, Chuanhao Lv, Yi Jing, Laohu Wang, Jingnan Zhang, Canan Huang, Zhongxiang Yan, Chi Hu, Bei Li, Tong Xiao, Jingbo Zhu
This paper describes the NiuTrans neural machine translation systems for the WMT 2021 news translation tasks.
2 code implementations • WS 2020 • Chi Hu, Bei Li, Ye Lin, Yinqiao Li, Yanyang Li, Chenglong Wang, Tong Xiao, Jingbo Zhu
This paper describes the submissions of the NiuTrans Team to the WNGT 2020 Efficiency Shared Task.
1 code implementation • 16 Sep 2021 • Chenglong Wang, Chi Hu, Yongyu Mu, Zhongxiang Yan, Siming Wu, Minyi Hu, Hang Cao, Bei Li, Ye Lin, Tong Xiao, Jingbo Zhu
This paper describes the NiuTrans system for the WMT21 translation efficiency task (http://statmt.org/wmt21/efficiency-task.html).
no code implementations • 2 Aug 2021 • Xiangxiang Zhu, Bei Li, Kunde Yang, Zhuosheng Zhang, Wenting Li
The standard chirplet transform (CT) with a chirp-modulated Gaussian window provides a valuable tool for analyzing linear chirp signals.
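For reference, one common convention for the standard chirplet transform with a chirp-modulated Gaussian window is sketched below; sign and normalization conventions vary across the literature, so this is not necessarily the paper's exact notation.

```latex
\[
  \mathrm{CT}_s(t_0,\omega;\alpha)
  = \int_{-\infty}^{\infty} s(t)\, g_\sigma(t - t_0)\,
    e^{-\mathrm{j}\left[\omega (t - t_0) + \tfrac{\alpha}{2}(t - t_0)^2\right]} \,\mathrm{d}t,
  \qquad
  g_\sigma(t) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-t^2/(2\sigma^2)},
\]
```

Here $\alpha$ is the chirp rate of the window modulation; when $\alpha = 0$ the transform reduces to a short-time Fourier transform with a Gaussian window.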
no code implementations • 6 Apr 2021 • Bei Li, Quan Du, Tao Zhou, Shuhan Zhou, Xin Zeng, Tong Xiao, Jingbo Zhu
We show that a residual block of layers in Transformer can be described as a higher-order solution to ODEs.
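The underlying analogy, restated as a textbook identity from the neural-ODE literature (not a result specific to this paper): a residual connection is a first-order Euler step of an ODE.

```latex
\[
  y_{n+1} = y_n + F(y_n, \theta_n)
  \quad\Longleftrightarrow\quad
  \frac{\mathrm{d}y}{\mathrm{d}t} = F\big(y(t), \theta(t)\big)
  \ \text{discretized with step size } 1,
\]
```

So a stack of residual layers acts as a first-order ODE solver, and replacing the Euler step with a higher-order scheme (e.g., Runge-Kutta) is the refinement this line of work pursues.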
1 code implementation • 27 Dec 2020 • Bei Li, Ziyang Wang, Hui Liu, Quan Du, Tong Xiao, Chunliang Zhang, Jingbo Zhu
We proposed a novel group-permutation based knowledge distillation approach to compressing the deep Transformer model into a shallow model.
1 code implementation • EMNLP 2020 • Bei Li, Ziyang Wang, Hui Liu, Yufan Jiang, Quan Du, Tong Xiao, Huizhen Wang, Jingbo Zhu
We find that stacking layers is helpful in improving the representation ability of NMT models and that adjacent layers perform similarly.
no code implementations • 23 Sep 2020 • Hongyi Li, You Lv, Xiaoliang Chen, Bei Li, Qi Hua, Fusui Ji, Yajun Yin, Hua Li
In real-time observations, the calculated velocity of a continuous ISF flow along fibers of a PACT pathway was 3.6–15.6 mm/sec.
no code implementations • ACL 2021 • Ye Lin, Yanyang Li, Ziyang Wang, Bei Li, Quan Du, Tong Xiao, Jingbo Zhu
Inspired by this, we investigate methods of model acceleration and compression in another line of research.
1 code implementation • ACL 2020 • Bei Li, Hui Liu, Ziyang Wang, Yufan Jiang, Tong Xiao, Jingbo Zhu, Tongran Liu, Changliang Li
In encoder-decoder neural models, multiple encoders are generally used to represent contextual information in addition to the individual sentence.
no code implementations • WS 2019 • Bei Li, Yinqiao Li, Chen Xu, Ye Lin, Jiqiang Liu, Hui Liu, Ziyang Wang, Yuhao Zhang, Nuo Xu, Zeyang Wang, Kai Feng, Hexuan Chen, Tengbo Liu, Yanyang Li, Qiang Wang, Tong Xiao, Jingbo Zhu
We participated in 13 translation directions, including 11 supervised tasks, namely EN↔{ZH, DE, RU, KK, LT} and GU→EN, as well as the unsupervised DE↔CS sub-track.
2 code implementations • ACL 2019 • Qiang Wang, Bei Li, Tong Xiao, Jingbo Zhu, Changliang Li, Derek F. Wong, Lidia S. Chao
Transformer is the state-of-the-art model in recent machine translation evaluations.
no code implementations • WS 2018 • Qiang Wang, Bei Li, Jiqiang Liu, Bojian Jiang, Zheyang Zhang, Yinqiao Li, Ye Lin, Tong Xiao, Jingbo Zhu
This paper describes the submission of the NiuTrans neural machine translation system for the WMT 2018 Chinese ↔ English news translation tasks.
no code implementations • 15 Feb 2017 • Zhiyuan Zha, Xin Yuan, Bei Li, Xinggan Zhang, Xin Liu, Lan Tang, Ying-Chang Liang
However, a sound mathematical explanation of why WNNM is more feasible than NNM is still lacking.
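For context, the two objectives being compared can be written in their standard form; this is the usual formulation for recovering a low-rank matrix $X$ from a noisy observation $Y$, not a reproduction of this paper's specific notation.

```latex
\[
  \text{NNM:}\quad
  \hat{X} = \arg\min_{X} \tfrac{1}{2}\|Y - X\|_F^2 + \lambda \|X\|_*,
  \qquad \|X\|_* = \sum_i \sigma_i(X),
\]
\[
  \text{WNNM:}\quad
  \hat{X} = \arg\min_{X} \tfrac{1}{2}\|Y - X\|_F^2 + \|X\|_{w,*},
  \qquad \|X\|_{w,*} = \sum_i w_i\, \sigma_i(X),\quad w_i \ge 0,
\]
```

Assigning smaller weights $w_i$ to larger singular values penalizes the dominant data components less, which is the extra flexibility WNNM offers over NNM.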