1 code implementation • 10 Apr 2024 • Jianzhi Liu, Hexiang Gu, Tianyu Zheng, Liuyu Xiang, Huijia Wu, Jie Fu, Zhaofeng He
We propose a new metric to assess personality generation capability based on this evaluation method.
1 code implementation • 20 Feb 2024 • Hao Zhao, Zihan Qiu, Huijia Wu, Zili Wang, Zhaofeng He, Jie Fu
The Mixture of Experts (MoE) approach for language models has proven effective at augmenting model capacity by dynamically routing each input token to a specific subset of experts for processing.
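For intuition about how such routing works, here is a minimal top-k MoE layer in PyTorch. All names (`MoELayer`, `num_experts`, `top_k`) and sizes are illustrative assumptions, not the paper's implementation:

```python
# Minimal sketch of top-k token routing in a Mixture-of-Experts layer.
# Names and sizes are illustrative, not from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, d_model=64, num_experts=4, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, num_experts)          # router
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model),
                          nn.ReLU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x):                                    # x: (tokens, d_model)
        scores = self.gate(x)                                # (tokens, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)       # each token picks top-k experts
        weights = F.softmax(weights, dim=-1)                 # normalize over chosen experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e                        # tokens routed to expert e at slot k
                if mask.any():
                    out[mask] += weights[mask, k, None] * expert(x[mask])
        return out

x = torch.randn(10, 64)      # 10 tokens
print(MoELayer()(x).shape)   # torch.Size([10, 64])
```

Only the selected experts run on each token, which is what lets MoE grow parameter count without a proportional increase in per-token compute.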
no code implementations • 3 Jan 2017 • Huijia Wu, Jiajun Zhang, Cheng-qing Zong
To simplify the stacked architecture, we propose a framework called the shortcut block, which marries the gating mechanism with shortcuts while discarding the self-connected part of the LSTM cell.
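One way to read the gated-shortcut idea is sketched below: a learned gate mixes a feed-forward transformation with an identity shortcut, with no recurrent self-connection. The names and sizes are illustrative assumptions, not the paper's code:

```python
# Hedged sketch of a gated shortcut: a sigmoid gate interpolates between
# the layer's transformation and the unchanged input (the shortcut),
# without any recurrent self-connection. Names/sizes are assumptions.
import torch
import torch.nn as nn

class ShortcutBlock(nn.Module):
    def __init__(self, d=128):
        super().__init__()
        self.transform = nn.Linear(d, d)
        self.gate = nn.Linear(d, d)

    def forward(self, x):
        g = torch.sigmoid(self.gate(x))      # gate values in (0, 1)
        h = torch.tanh(self.transform(x))    # candidate output
        return g * h + (1.0 - g) * x         # gated mix of transform and shortcut

print(ShortcutBlock()(torch.randn(5, 128)).shape)  # torch.Size([5, 128])
```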
no code implementations • COLING 2016 • Huijia Wu, Jiajun Zhang, Cheng-qing Zong
In this paper, we empirically explore the effects of various kinds of skip connections in stacked bidirectional LSTMs for sequential tagging.
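For intuition, here is a minimal sketch of one such variant: a residual-style skip across each stacked BiLSTM layer. The depth, sizes, and specific connection pattern are assumptions, since the paper compares several kinds of skip connections:

```python
# Minimal sketch of a skip connection across stacked BiLSTM layers:
# each layer's output is added to its input. Depth and sizes are assumed.
import torch
import torch.nn as nn

class StackedBiLSTM(nn.Module):
    def __init__(self, d=100, layers=3):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.LSTM(d, d // 2, bidirectional=True, batch_first=True)
            for _ in range(layers)
        )

    def forward(self, x):          # x: (batch, seq, d)
        for lstm in self.layers:
            out, _ = lstm(x)       # bidirectional output has width d
            x = x + out            # skip connection across the layer
        return x

x = torch.randn(2, 7, 100)         # batch of 2, length-7 sequences
print(StackedBiLSTM()(x).shape)    # torch.Size([2, 7, 100])
```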
no code implementations • 10 Oct 2016 • Huijia Wu, Jiajun Zhang, Cheng-qing Zong
These observations motivate us to build a supertagger with a dynamic window approach, which can be treated as an attention mechanism over local contexts.
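One plausible reading of attention over local contexts is sketched below: each position attends only to neighbours within a window of w tokens on either side. The window size and dimensions are assumed for illustration and are not the paper's dynamic-window mechanism itself:

```python
# Hedged sketch of attention restricted to a local context window:
# positions outside +/- w tokens are masked out before the softmax.
# Window size and dimensions are illustrative assumptions.
import torch
import torch.nn.functional as F

def local_window_attention(h, w=2):
    """h: (seq, d) hidden states; returns (seq, d) context vectors."""
    seq, d = h.shape
    scores = h @ h.t() / d ** 0.5                    # (seq, seq) similarities
    pos = torch.arange(seq)
    mask = (pos[:, None] - pos[None, :]).abs() > w   # True outside the window
    scores = scores.masked_fill(mask, float('-inf'))
    return F.softmax(scores, dim=-1) @ h             # weighted local contexts

h = torch.randn(9, 32)
print(local_window_attention(h).shape)               # torch.Size([9, 32])
```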