Self-Knowledge Distillation

24 papers with code • 0 benchmarks • 0 datasets

Self-knowledge distillation trains a network with soft targets produced by the network itself (e.g., its own predictions from earlier epochs, an auxiliary branch, or a differently augmented view) instead of a separate, larger pretrained teacher. A minimal loss sketch follows below.
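The following sketch shows the loss form shared by many of the papers listed on this page, assuming a standard classification setup; the function name, mixing weight `alpha`, and temperature `T` are illustrative choices, not values from any particular paper.

```python
import torch
import torch.nn.functional as F

def self_distillation_loss(logits, teacher_logits, targets, alpha=0.3, T=4.0):
    """Cross-entropy on the hard labels plus KL divergence toward the
    model's own softened predictions.

    `teacher_logits` typically come from the same network (an earlier epoch,
    an auxiliary branch, or another augmented view) and are detached so no
    gradient flows through the "teacher" side.
    """
    ce = F.cross_entropy(logits, targets)
    soft_teacher = F.softmax(teacher_logits.detach() / T, dim=-1)
    log_student = F.log_softmax(logits / T, dim=-1)
    kd = F.kl_div(log_student, soft_teacher, reduction="batchmean") * (T * T)
    return (1 - alpha) * ce + alpha * kd
```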

Most implemented papers

ProSelfLC: Progressive Self Label Correction for Training Robust Deep Neural Networks

XinshaoAmosWang/ProSelfLC-CVPR2021 CVPR 2021

Keywords: entropy minimisation, maximum entropy, confidence penalty, self knowledge distillation, label correction, label noise, semi-supervised learning, output regularisation

Revisiting Knowledge Distillation via Label Smoothing Regularization

yuanli2333/Teacher-free-Knowledge-Distillation CVPR 2020

Without any extra computation cost, Tf-KD achieves up to 0.65% improvement on ImageNet over well-established baseline models, which is superior to label smoothing regularization.
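A hedged sketch of the "virtual teacher" idea that Tf-KD relates to label smoothing: a hand-designed distribution places most of the probability mass on the ground-truth class and spreads the rest uniformly, then serves as the soft target. The probability value, mixing weight, and temperature handling here are illustrative, not the paper's exact settings.

```python
import torch
import torch.nn.functional as F

def virtual_teacher_targets(targets, num_classes, correct_prob=0.9):
    """Hand-designed teacher distribution: `correct_prob` on the true class,
    the remainder spread uniformly over the other classes."""
    off_value = (1.0 - correct_prob) / (num_classes - 1)
    soft = torch.full((targets.size(0), num_classes), off_value,
                      device=targets.device)
    soft.scatter_(1, targets.unsqueeze(1), correct_prob)
    return soft

def teacher_free_kd_loss(logits, targets, alpha=0.5, T=20.0):
    # Cross-entropy on hard labels plus KL toward the fixed virtual teacher.
    ce = F.cross_entropy(logits, targets)
    teacher = virtual_teacher_targets(targets, logits.size(-1))
    kd = F.kl_div(F.log_softmax(logits / T, dim=-1), teacher,
                  reduction="batchmean")
    return (1 - alpha) * ce + alpha * kd
```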

Preservation of the Global Knowledge by Not-True Distillation in Federated Learning

Lee-Gihun/FedNTD 6 Jun 2021

In federated learning, a strong global model is collaboratively learned by aggregating clients' locally trained models.
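The sentence above describes the standard aggregation step in federated learning; below is a minimal FedAvg-style sketch of that step (weighted parameter averaging), not FedNTD's not-true distillation itself. Function and argument names are illustrative.

```python
import torch

def fedavg_aggregate(client_state_dicts, client_num_samples):
    """Weighted average of locally trained client models (plain FedAvg).

    `client_state_dicts`: list of model state_dicts from the clients.
    `client_num_samples`: number of local training examples per client.
    Integer buffers (e.g., BatchNorm counters) would need special handling
    in a real implementation.
    """
    total = float(sum(client_num_samples))
    global_state = {}
    for key in client_state_dicts[0]:
        global_state[key] = torch.stack(
            [sd[key].float() * (n / total)
             for sd, n in zip(client_state_dicts, client_num_samples)]
        ).sum(dim=0)
    return global_state
```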

Regularizing Class-wise Predictions via Self-knowledge Distillation

alinlab/cs-kd CVPR 2020

Deep neural networks with millions of parameters may suffer from poor generalization due to overfitting.

Self-Knowledge Distillation with Progressive Refinement of Targets

lgcnsai/ps-kd-pytorch ICCV 2021

Hence, it can be interpreted within a framework of knowledge distillation in which the student becomes its own teacher.
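A hedged sketch in the spirit of progressive self-knowledge distillation: the soft target mixes the one-hot label with the model's own predictions from the previous epoch, and the mixing weight grows over training. The linear schedule and `alpha_max` value are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def progressive_soft_targets(targets, past_logits, num_classes,
                             epoch, total_epochs, alpha_max=0.8):
    """Soft targets that progressively trust the model's own past predictions.

    `past_logits`: logits the same network produced for these examples in the
    previous epoch (stored or recomputed); detached so they act as a teacher.
    """
    alpha = alpha_max * epoch / total_epochs        # grows from 0 to alpha_max
    one_hot = F.one_hot(targets, num_classes).float()
    past_probs = F.softmax(past_logits.detach(), dim=-1)
    return (1 - alpha) * one_hot + alpha * past_probs

def soft_target_loss(logits, soft_targets):
    # Cross-entropy against a full (soft) target distribution.
    return -(soft_targets * F.log_softmax(logits, dim=-1)).sum(dim=-1).mean()
```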

Noisy Self-Knowledge Distillation for Text Summarization

nlpyang/NoisySumm NAACL 2021

In this paper, we apply self-knowledge distillation to text summarization, which we argue can alleviate problems with maximum-likelihood training on single-reference and noisy datasets.
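A rough sketch of token-level self-distillation for a sequence-to-sequence summarizer: the student's cross-entropy on the reference is mixed with a KL term toward a frozen earlier copy of the same model, with dropout left active on the teacher pass as one simple source of noise. The model interface (`model(src, tgt_in)`), the noise scheme, and the weights are assumptions, not the paper's exact recipe.

```python
import torch
import torch.nn.functional as F

def noisy_self_kd_loss(model, teacher, src, tgt_in, tgt_out, pad_id,
                       alpha=0.5, T=1.0):
    """Token-level self-distillation for summarization.

    `teacher` is a frozen earlier copy of `model`; keeping it in train() mode
    injects dropout noise into its soft targets (one simple form of "noisy"
    teaching; the paper's noise scheme may differ).
    """
    student_logits = model(src, tgt_in)                     # (B, L, V)
    with torch.no_grad():
        teacher.train()                                     # dropout as noise
        teacher_probs = F.softmax(teacher(src, tgt_in) / T, dim=-1)

    nll = F.cross_entropy(student_logits.transpose(1, 2), tgt_out,
                          ignore_index=pad_id)
    mask = (tgt_out != pad_id).float()
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=-1), teacher_probs,
                  reduction="none").sum(-1)
    kd = (kd * mask).sum() / mask.sum()
    return (1 - alpha) * nll + alpha * kd
```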

Even your Teacher Needs Guidance: Ground-Truth Targets Dampen Regularization Imposed by Self-Distillation

Kennethborup/self_distillation NeurIPS 2021

Knowledge distillation is classically a procedure where a neural network is trained on the output of another network along with the original targets in order to transfer knowledge between the architectures.
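For contrast with the self-teacher sketch near the top of this page, here is the classical two-network form described in the sentence above: a frozen pretrained teacher provides soft targets alongside the original labels. Hyperparameters are illustrative.

```python
import torch
import torch.nn.functional as F

def classical_kd_loss(student_logits, teacher_logits, targets,
                      alpha=0.5, T=2.0):
    """Hard-label cross-entropy plus KL divergence toward the pretrained
    teacher's softened outputs (the standard Hinton-style objective)."""
    ce = F.cross_entropy(student_logits, targets)
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                  F.softmax(teacher_logits / T, dim=-1),
                  reduction="batchmean") * (T * T)
    return alpha * ce + (1 - alpha) * kd

# Typical usage inside a training step, with the teacher kept frozen:
# with torch.no_grad():
#     teacher_logits = teacher(images)
# loss = classical_kd_loss(student(images), teacher_logits, labels)
```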

Refine Myself by Teaching Myself: Feature Refinement via Self-Knowledge Distillation

MingiJi/FRSKD CVPR 2021

Knowledge distillation is a method of transferring the knowledge from a pretrained complex teacher model to a student model, so a smaller network can replace a large teacher network at the deployment stage.
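A hedged sketch of feature-level self-distillation in the spirit of feature refinement: feature maps produced by an auxiliary "self-teacher" branch of the same network serve as L2 targets for the backbone's features. The branch, names, and loss weighting are assumptions, not FRSKD's exact architecture.

```python
import torch
import torch.nn.functional as F

def feature_self_distillation_loss(backbone_feats, refined_feats):
    """L2 alignment between backbone feature maps and refined feature maps
    from an auxiliary branch of the same network; the refined features are
    detached so they act as fixed targets."""
    loss = 0.0
    for f, r in zip(backbone_feats, refined_feats):
        if f.shape[-2:] != r.shape[-2:]:
            # Match spatial size if the refinement branch changes resolution.
            r = F.interpolate(r, size=f.shape[-2:], mode="bilinear",
                              align_corners=False)
        loss = loss + F.mse_loss(f, r.detach())
    return loss / len(backbone_feats)
```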

Robust and Accurate Object Detection via Self-Knowledge Distillation

grispeut/udfa 14 Nov 2021

In this paper, we propose Unified Decoupled Feature Alignment (UDFA), a novel fine-tuning paradigm that achieves better performance than existing methods by fully exploring the combination of self-knowledge distillation and adversarial training for object detection.
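A rough sketch of one way self-knowledge distillation and adversarial training can be combined for detection: the detector's features on an adversarially perturbed image are pulled toward its own (detached) features on the clean image, on top of the usual detection loss. The `detector.backbone`/`detector.loss` interface, the FGSM-style attack, and the weights are assumptions for illustration, not UDFA's decoupled alignment.

```python
import torch
import torch.nn.functional as F

def adversarial_self_kd_step(detector, images, det_targets,
                             eps=2.0 / 255, beta=1.0):
    """Detection loss on adversarial images plus a self-distillation term
    aligning adversarial features with clean features of the same model.
    Assumes `detector.backbone(x)` returns a feature tensor and
    `detector.loss(x, targets)` returns a scalar detection loss."""
    # One-step (FGSM-style) perturbation of the input images.
    images_adv = images.clone().detach().requires_grad_(True)
    attack_loss = detector.loss(images_adv, det_targets)
    grad = torch.autograd.grad(attack_loss, images_adv)[0]
    images_adv = (images_adv + eps * grad.sign()).clamp(0, 1).detach()

    with torch.no_grad():
        clean_feats = detector.backbone(images)        # self-teacher features
    adv_feats = detector.backbone(images_adv)
    kd = F.mse_loss(adv_feats, clean_feats)
    return detector.loss(images_adv, det_targets) + beta * kd
```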

Improving Sequential Recommendations via Bidirectional Temporal Data Augmentation with Pre-training

juyongjiang/barec 13 Dec 2021

Our approach leverages bidirectional temporal augmentation and knowledge-enhanced fine-tuning to synthesize authentic pseudo-prior items that retain user preferences and capture deeper item semantic correlations, thus boosting the model's expressive power.