1 code implementation • 22 Apr 2024 • Yuhong Li, Yingbing Huang, Bowen Yang, Bharat Venkitesh, Acyr Locatelli, Hanchen Ye, Tianle Cai, Patrick Lewis, Deming Chen
Specifically, SnapKV achieves a consistent decoding speed with a 3. 6x increase in generation speed and an 8. 2x enhancement in memory efficiency compared to baseline when processing inputs of 16K tokens.
1 code implementation • 19 Jan 2024 • Tianle Cai, Yuhong Li, Zhengyang Geng, Hongwu Peng, Jason D. Lee, Deming Chen, Tri Dao
We present two levels of fine-tuning procedures for Medusa to meet the needs of different use cases: Medusa-1: Medusa is directly fine-tuned on top of a frozen backbone LLM, enabling lossless inference acceleration.
no code implementations • ICCV 2023 • Yuhong Li, Jiajie Li, Cong Hao, Pan Li, JinJun Xiong, Deming Chen
We further propose a Discrete Proxy Search (DPS) method to find the optimized training settings for Eproxy with only a handful of benchmarked architectures on the target tasks.
no code implementations • ICCV 2023 • Zi Qian, Xin Wang, Xuguang Duan, Pengda Qin, Yuhong Li, Wenwu Zhu
Based on our formulation, we further propose MulTi-Modal PRompt LearnIng with DecouPLing bEfore InTeraction (TRIPLET), a novel approach that builds on a pre-trained vision-language model and consists of decoupled prompts and prompt interaction strategies to capture the complex interactions between modalities.
1 code implementation • 20 Nov 2022 • Xichen Pan, Pengda Qin, Yuhong Li, Hui Xue, Wenhu Chen
Conditioned diffusion models have demonstrated state-of-the-art text-to-image synthesis capacity.
Ranked #1 on Story Visualization on Pororo
1 code implementation • 17 Oct 2022 • Yuhong Li, Jiajie Li, Cong Han, Pan Li, JinJun Xiong, Deming Chen
(2) Efficient proxies are not extensible to multi-modality downstream tasks.
1 code implementation • 17 Oct 2022 • Yuhong Li, Tianle Cai, Yi Zhang, Deming Chen, Debadeepta Dey
We focus on the structure of the convolution kernel and identify two critical but intuitive principles enjoyed by S4 that are sufficient to make up an effective global convolutional model: 1) The parameterization of the convolutional kernel needs to be efficient in the sense that the number of parameters should scale sub-linearly with sequence length.
Ranked #6 on Long-range modeling on LRA
no code implementations • 6 Jun 2022 • Xiaofan Zhang, Yao Chen, Cong Hao, Sitao Huang, Yuhong Li, Deming Chen
Deep Neural Networks (DNNs) have achieved great success in a variety of machine learning (ML) applications, delivering high-quality inferencing solutions in computer vision, natural language processing, and virtual reality, etc.
2 code implementations • NeurIPS 2021 • Yuhong Li, Cong Hao, Pan Li, JinJun Xiong, Deming Chen
Such a self-supervised regression task can effectively evaluate the intrinsic power of an architecture to capture and transform the input signal patterns, and allow more sufficient usage of training samples.
Ranked #1 on Neural Architecture Search on NAS-Bench-101 (Spearman Correlation metric)
no code implementations • 13 Jun 2021 • Runshi Liu, Pengda Qin, Yuhong Li, Weigao Wen, Dong Li, Kefeng Deng, Qiang Wu
Typically, the risk can be identified by jointly considering product content (e. g., title and image) and seller behavior.
no code implementations • 3 Jun 2021 • Pengda Qin, Yuhong Li, Kefeng Deng, Qiang Wu
Among ubiquitous multimodal data in the real world, text is the modality generated by human, while image reflects the physical world honestly.
1 code implementation • 1 Jan 2021 • Yuhong Li, Cong Hao, Xiaofan Zhang, JinJun Xiong, Wen-mei Hwu, Deming Chen
This raises the question of whether we can find an effective proxy search space (PS) that is only a small subset of GS to dramatically improve RandomNAS’s search efficiency while at the same time keeping a good correlation for the top-performing architectures.
no code implementations • 14 Oct 2020 • Cong Hao, Yao Chen, Xiaofan Zhang, Yuhong Li, JinJun Xiong, Wen-mei Hwu, Deming Chen
High quality AI solutions require joint optimization of AI algorithms, such as deep neural networks (DNNs), and their hardware accelerators.
no code implementations • 9 Jun 2020 • Xiaofeng Mao, Yuefeng Chen, Yuhong Li, Yuan He, Hui Xue
Different from previous single-target attack models, our model can conduct target-conditioned attacks by learning the relations of attack target and the semantics in image.
no code implementations • 6 May 2020 • Yuhong Li, Cong Hao, Xiaofan Zhang, Xinheng Liu, Yao Chen, JinJun Xiong, Wen-mei Hwu, Deming Chen
We formulate the co-search problem by fusing DNN search variables and hardware implementation variables into one solution space, and maximize both algorithm accuracy and hardware implementation quality.
1 code implementation • 15 Nov 2019 • Kejiang Chen, Hang Zhou, Yuefeng Chen, Xiaofeng Mao, Yuhong Li, Yuan He, Hui Xue, Weiming Zhang, Nenghai Yu
Recent work has demonstrated that neural networks are vulnerable to adversarial examples.
no code implementations • 15 Nov 2019 • Xiaofeng Mao, Yuefeng Chen, Yuhong Li, Yuan He, Hui Xue
To detect these adversarial examples, previous methods use artificially designed metrics to characterize the properties of \textit{adversarial subspaces} where adversarial examples lie.
2 code implementations • 14 Nov 2019 • Da Chen, Yuefeng Chen, Yuhong Li, Feng Mao, Yuan He, Hui Xue
In this paper, we proposed to train a more generalized embedding network with self-supervised learning (SSL) which can provide robust representation for downstream tasks by learning from the data itself.
Ranked #4 on Few-Shot Image Classification on Mini-ImageNet - 1-Shot Learning (using extra training data)
2 code implementations • 20 Sep 2019 • Xiaofan Zhang, Haoming Lu, Cong Hao, Jiachen Li, Bowen Cheng, Yuhong Li, Kyle Rupnow, JinJun Xiong, Thomas Huang, Honghui Shi, Wen-mei Hwu, Deming Chen
Object detection and tracking are challenging tasks for resource-constrained embedded systems.
1 code implementation • 25 Jun 2019 • Xiaofan Zhang, Cong Hao, Haoming Lu, Jiachen Li, Yuhong Li, Yuchen Fan, Kyle Rupnow, JinJun Xiong, Thomas Huang, Honghui Shi, Wen-mei Hwu, Deming Chen
Developing artificial intelligence (AI) at the edge is always challenging, since edge devices have limited computation capability and memory resources but need to meet demanding requirements, such as real-time processing, high throughput performance, and high inference accuracy.
2 code implementations • 20 May 2019 • Xiaofan Zhang, Cong Hao, Yuhong Li, Yao Chen, JinJun Xiong, Wen-mei Hwu, Deming Chen
Developing deep learning models for resource-constrained Internet-of-Things (IoT) devices is challenging, as it is difficult to achieve both good quality of results (QoR), such as DNN model inference accuracy, and quality of service (QoS), such as inference latency, throughput, and power consumption.
2 code implementations • 9 Apr 2019 • Cong Hao, Xiaofan Zhang, Yuhong Li, Sitao Huang, JinJun Xiong, Kyle Rupnow, Wen-mei Hwu, Deming Chen
While embedded FPGAs are attractive platforms for DNN acceleration on edge-devices due to their low latency and high energy efficiency, the scarcity of resources of edge-scale FPGA devices also makes it challenging for DNN deployment.
1 code implementation • 18 Mar 2019 • Xiaofeng Mao, Yuefeng Chen, Yuhong Li, Tao Xiong, Yuan He, Hui Xue
The task of Language-Based Image Editing (LBIE) aims at generating a target image by editing the source image based on the given language description.
4 code implementations • 7 Feb 2019 • Yuhong Li, Xiaofan Zhang, Deming Chen
It combines a Convolutional Neural Network (CNN) backbone and a cross-correlation operator, and takes advantage of the features from exemplary images for more accurate object tracking.
Ranked #1 on Visual Object Tracking on OTB-50
13 code implementations • CVPR 2018 • Yuhong Li, Xiaofan Zhang, Deming Chen
We demonstrate CSRNet on four datasets (ShanghaiTech dataset, the UCF_CC_50 dataset, the WorldEXPO'10 dataset, and the UCSD dataset) and we deliver the state-of-the-art performance.
Ranked #3 on Crowd Counting on Venice