no code implementations • 22 Mar 2024 • Kyungmin Lee, Kihyuk Sohn, Jinwoo Shin
Recent progress in text-to-3D generation has been driven by score distillation methods, which exploit pre-trained text-to-image (T2I) diffusion models by distilling knowledge through the diffusion model's training objective.
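For context, score distillation in its common (DreamFusion-style SDS) form treats the frozen T2I model's denoising residual as a gradient for the 3D parameters. The sketch below illustrates one such update step; `render`, `denoiser`, and `alphas_cumprod` are hypothetical stand-ins for a differentiable renderer, the pre-trained noise predictor, and its noise schedule, and this is not the specific method proposed in the paper.

```python
# Minimal sketch of one score-distillation (SDS-style) update step.
# `render`, `denoiser`, and `alphas_cumprod` are hypothetical stand-ins.
import torch

def sds_step(params, render, denoiser, text_emb, alphas_cumprod, optimizer):
    x = render(params)                                   # rendered views, (B, C, H, W)
    t = torch.randint(20, 980, (x.shape[0],), device=x.device)
    a_t = alphas_cumprod[t].view(-1, 1, 1, 1)            # cumulative noise schedule
    noise = torch.randn_like(x)
    x_t = a_t.sqrt() * x + (1 - a_t).sqrt() * noise      # diffuse the rendering
    with torch.no_grad():
        eps_pred = denoiser(x_t, t, text_emb)            # frozen T2I model predicts the noise
    grad = (1 - a_t) * (eps_pred - noise)                # SDS residual, weighted by w(t)
    loss = (grad.detach() * x).sum()                     # surrogate whose x-gradient equals grad
    optimizer.zero_grad()
    loss.backward()                                      # pushes the residual into the 3D parameters
    optimizer.step()
    return loss.item()
```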
1 code implementation • 8 Mar 2024 • Yisol Choi, Sangkyung Kwak, Kyungmin Lee, Hyungwon Choi, Jinwoo Shin
Finally, we present a customization method using a pair of person-garment images, which significantly improves fidelity and authenticity.
Ranked #1 on Virtual Try-on on VITON-HD
no code implementations • 19 Feb 2024 • Kyungmin Lee, Sangkyung Kwak, Kihyuk Sohn, Jinwoo Shin
In particular, our method achieves a superior Pareto frontier compared to the baselines.
no code implementations • 4 Jul 2023 • Subin Kim, Kyungmin Lee, June Suk Choi, Jongheon Jeong, Kihyuk Sohn, Jinwoo Shin
Generative priors of large-scale text-to-image diffusion models enable a wide range of new generation and editing applications on diverse visual modalities.
1 code implementation • NeurIPS 2023 • Sangwoo Mo, Minkyu Kim, Kyungmin Lee, Jinwoo Shin
By combining these objectives, S-CLIP significantly enhances the training of CLIP using only a few image-text pairs, as demonstrated in various specialist domains, including remote sensing, fashion, scientific figures, and comics.
1 code implementation • 2 Mar 2023 • Jaehyun Nam, Jihoon Tack, Kyungmin Lee, Hankook Lee, Jinwoo Shin
Learning from few labeled tabular samples is often an essential requirement for industrial machine learning applications, since many kinds of tabular data suffer from high annotation costs or make it difficult to collect new samples for novel tasks.
1 code implementation • 26 Jan 2023 • Younghyun Kim, Sangwoo Mo, Minkyu Kim, Kyungmin Lee, Jaeho Lee, Jinwoo Shin
The keyword explanation form of visual bias offers several advantages, such as clear group naming for bias discovery and a natural extension to debiasing using these group names.
no code implementations • 22 Aug 2022 • Gilhyun Nam, Gyeongjae Choi, Kyungmin Lee
In sum, we refer to our method as Guided Causal Invariant Syn-to-real Generalization, which effectively improves syn-to-real generalization performance.
1 code implementation • 12 Aug 2022 • Kyungmin Lee, Jinwoo Shin
Here, the quality of the learned representations is sensitive to the choice of data augmentation: as harder augmentations are applied, the views share more task-relevant information, but also more task-irrelevant information that can hinder the generalization capability of the representation.
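As background, a standard two-view contrastive (InfoNCE) objective, which augmentation-sensitive methods of this kind build on, can be sketched as follows; this is a generic illustration, not the paper's specific estimator.

```python
# Generic two-view InfoNCE contrastive loss (an illustration, not the paper's objective).
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.2):
    """z1, z2: (N, D) embeddings of two augmented views of the same N samples."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature           # (N, N) pairwise similarities
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)       # positives lie on the diagonal

# Harder augmentations change what the two views share, which this loss
# implicitly treats as the signal to preserve.
```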
no code implementations • ICLR 2022 • Kyungmin Lee
Transferring the representational knowledge of one model to another is a wide-ranging topic in machine learning.
no code implementations • 1 Jan 2021 • Kyungmin Lee, Seyoon Oh
In this work, we present an efficient method for randomized smoothing that does not require any re-training of classifiers.
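For reference, the usual randomized-smoothing prediction rule classifies by majority vote over Gaussian-perturbed copies of the input; a minimal sketch is below, with `base_classifier` as a hypothetical stand-in for any pre-trained classifier (the paper's point is that this classifier need not be re-trained).

```python
# Randomized-smoothing prediction by majority vote under Gaussian input noise.
# `base_classifier` is a hypothetical stand-in for any pre-trained classifier.
import torch

@torch.no_grad()
def smoothed_predict(base_classifier, x, num_classes, sigma=0.25, n_samples=100, batch=50):
    """x: (C, H, W) input; returns the class receiving the most votes under noise."""
    counts = torch.zeros(num_classes, dtype=torch.long, device=x.device)
    remaining = n_samples
    while remaining > 0:
        b = min(batch, remaining)
        noisy = x.unsqueeze(0) + sigma * torch.randn(b, *x.shape, device=x.device)
        preds = base_classifier(noisy).argmax(dim=1)     # per-sample class predictions
        counts += torch.bincount(preds, minlength=num_classes)
        remaining -= b
    return counts.argmax().item()
```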
no code implementations • 23 Jul 2020 • Kyungmin Lee, Chiyoun Park, Ilhwan Kim, Namhoon Kim, Jaewon Lee
Recurrent Neural Network Language Models (RNNLMs) are increasingly used across many areas of speech recognition owing to their outstanding performance.
1 code implementation • 23 Jul 2020 • Kyungmin Lee, Hyunwhan Joe, Hyeontaek Lim, Kwangyoun Kim, Sungsoo Kim, Chang Woo Han, Hong-Gee Kim
Input sequences are capsulized and then sliced into windows of fixed size.
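To make the windowing step concrete, slicing a frame-level feature sequence into fixed-size windows could look like the sketch below; the window and stride values are illustrative, not the paper's configuration.

```python
# Illustrative slicing of a (T, D) feature sequence into fixed-size windows.
# Window size and stride are made-up values, not the paper's settings.
import torch

def slice_into_windows(features, window=16, stride=8):
    """features: (T, D) frame-level features -> (num_windows, window, D)."""
    windows = features.unfold(0, window, stride)   # (num_windows, D, window) view
    return windows.transpose(1, 2).contiguous()    # (num_windows, window, D)
```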
no code implementations • 2 Jan 2020 • Kwangyoun Kim, Kyungmin Lee, Dhananjaya Gowda, Junmo Park, Sungsoo Kim, Sichen Jin, Young-Yoon Lee, Jinsu Yeo, Daehyun Kim, Seokyeong Jung, Jungin Lee, Myoungji Han, Chanwoo Kim
In this paper, we present a new on-device automatic speech recognition (ASR) system based on monotonic chunk-wise attention (MoChA) models trained on a large (> 10K hours) corpus.
Automatic Speech Recognition (ASR) +3
no code implementations • 22 Dec 2019 • Chanwoo Kim, Sungsoo Kim, Kwangyoun Kim, Mehul Kumar, Jiyeon Kim, Kyungmin Lee, Changwoo Han, Abhinav Garg, Eunhyang Kim, Minkyoo Shin, Shatrughan Singh, Larry Heck, Dhananjaya Gowda
Our end-to-end speech recognition system built using this training infrastructure showed a 2.44% WER on the test-clean subset of LibriSpeech after applying shallow fusion with a Transformer language model (LM).
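Shallow fusion itself is just a weighted sum of the ASR model's and the external LM's log-probabilities at each decoding step; a minimal sketch follows, with the two log-probability tensors treated as hypothetical inputs produced elsewhere in the decoder.

```python
# Shallow fusion: rank next-token candidates by ASR score plus a weighted LM score.
# `asr_log_probs` and `lm_log_probs` are hypothetical (V,)-shaped per-step tensors.
import torch

def fused_scores(asr_log_probs: torch.Tensor, lm_log_probs: torch.Tensor,
                 lm_weight: float = 0.3) -> torch.Tensor:
    """Returns the fused log-scores used to rank candidates during beam search."""
    return asr_log_probs + lm_weight * lm_log_probs

# Usage inside a decoding loop (illustrative):
# next_token = fused_scores(asr_lp, lm_lp).argmax().item()
```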
no code implementations • 29 Jul 2019 • Joseph C. Szabo, Kyungmin Lee, Vidya Madhavan, Nandini Trivedi
We elucidate the mechanism by which a Mott insulator transforms into a non-Fermi liquid metal upon increasing disorder at half filling.
Strongly Correlated Electrons • Disordered Systems and Neural Networks
no code implementations • 30 Jan 2018 • Kyungmin Lee, Chiyoun Park, Namhoon Kim, Jaewon Lee
This paper presents methods to accelerate recurrent neural network-based language models (RNNLMs) for online speech recognition systems.