Self-Knowledge Distillation
24 papers with code • 0 benchmarks • 0 datasets
Most implemented papers
ProSelfLC: Progressive Self Label Correction for Training Robust Deep Neural Networks
Keywords: entropy minimisation, maximum entropy, confidence penalty, self knowledge distillation, label correction, label noise, semi-supervised learning, output regularisation
Revisiting Knowledge Distillation via Label Smoothing Regularization
Without any extra computation cost, Tf-KD achieves up to 0.65% improvement on ImageNet over well-established baseline models, which is superior to label smoothing regularization.
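For reference, the label smoothing regularization baseline that Tf-KD is compared against builds soft targets by mixing the one-hot label with a uniform distribution over classes. Below is a minimal PyTorch sketch of that baseline; the function name, tensor shapes, and eps value are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def label_smoothing_targets(labels: torch.Tensor, num_classes: int, eps: float = 0.1) -> torch.Tensor:
    """Label smoothing: weight (1 - eps) on the one-hot target mixed with
    weight eps on a uniform distribution over all classes."""
    one_hot = F.one_hot(labels, num_classes).float()
    return one_hot * (1.0 - eps) + eps / num_classes

# Illustrative usage with random logits and labels.
logits = torch.randn(4, 10)
labels = torch.randint(0, 10, (4,))
soft = label_smoothing_targets(labels, num_classes=10)
loss = torch.sum(-soft * F.log_softmax(logits, dim=1), dim=1).mean()
```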
Preservation of the Global Knowledge by Not-True Distillation in Federated Learning
In federated learning, a strong global model is collaboratively learned by aggregating clients' locally trained models.
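As background on that aggregation step (not the paper's not-true distillation itself), here is a minimal FedAvg-style sketch that averages client weights in proportion to local data size; the helper name and weighting scheme are assumptions for illustration.

```python
import copy

def federated_average(client_states, client_sizes):
    """Average the parameters of locally trained copies of the same model,
    weighting each client by its number of local examples (FedAvg-style)."""
    total = float(sum(client_sizes))
    global_state = copy.deepcopy(client_states[0])
    for key in global_state:
        global_state[key] = sum(
            state[key].float() * (n / total)
            for state, n in zip(client_states, client_sizes)
        )
    return global_state

# Illustrative usage: the server loads the averaged weights back into the
# global model, e.g. global_model.load_state_dict(federated_average(states, sizes)).
```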
Regularizing Class-wise Predictions via Self-knowledge Distillation
Deep neural networks with millions of parameters may suffer from poor generalization due to overfitting.
Self-Knowledge Distillation with Progressive Refinement of Targets
Hence, it can be interpreted within a knowledge distillation framework in which the student becomes its own teacher.
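A minimal sketch of this idea: the hard label is progressively mixed with the model's own predictions from the previous epoch, so the weight on the self-generated targets grows as training proceeds. The linear schedule and alpha_max value here are illustrative choices, not necessarily the paper's exact hyperparameters.

```python
import torch
import torch.nn.functional as F

def progressive_self_kd_targets(labels, past_probs, epoch, total_epochs, alpha_max=0.8):
    """Mix the one-hot target with the model's own softmax outputs from the
    previous epoch; the self-teaching weight alpha_t grows linearly, so the
    student gradually becomes its own teacher."""
    alpha_t = alpha_max * epoch / total_epochs
    one_hot = F.one_hot(labels, past_probs.size(1)).float()
    return (1.0 - alpha_t) * one_hot + alpha_t * past_probs

# Illustrative usage inside a training loop:
# soft = progressive_self_kd_targets(labels, probs_from_last_epoch, epoch, total_epochs)
# loss = torch.sum(-soft * F.log_softmax(logits, dim=1), dim=1).mean()
```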
Noisy Self-Knowledge Distillation for Text Summarization
In this paper we apply self-knowledge distillation to text summarization, which we argue can alleviate problems with maximum-likelihood training on single-reference and noisy datasets.
Even your Teacher Needs Guidance: Ground-Truth Targets Dampen Regularization Imposed by Self-Distillation
Knowledge distillation is classically a procedure where a neural network is trained on the output of another network along with the original targets in order to transfer knowledge between the architectures.
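Since this is the classical setup the paper builds on, here is a minimal sketch of the standard distillation objective: cross-entropy on the original targets plus a temperature-scaled KL term toward the teacher's softened outputs. The temperature and mixing weight are illustrative hyperparameters.

```python
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, lam=0.5):
    """Classical knowledge distillation: supervised cross-entropy on the
    ground-truth labels plus a KL term that pulls the student's softened
    predictions toward the teacher's, scaled by T^2 as usual."""
    ce = F.cross_entropy(student_logits, labels)
    kl = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits.detach() / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    return (1.0 - lam) * ce + lam * kl
```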
Refine Myself by Teaching Myself: Feature Refinement via Self-Knowledge Distillation
Knowledge distillation is a method of transferring the knowledge from a pretrained complex teacher model to a student model, so a smaller network can replace a large teacher network at the deployment stage.
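As a loose illustration of the feature-level side of such self-distillation (the auxiliary self-teacher branch that produces the refined features is omitted and assumed given; this is a generic sketch, not the paper's exact method), the extra loss term can be as simple as matching the backbone's feature map to the refined one:

```python
import torch.nn.functional as F

def feature_refinement_term(student_feat, refined_feat):
    """Generic feature-level self-distillation term: pull the backbone's
    intermediate feature map toward a refined feature map produced by an
    auxiliary self-teacher branch; gradients are stopped on the target."""
    return F.mse_loss(student_feat, refined_feat.detach())
```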
Robust and Accurate Object Detection via Self-Knowledge Distillation
In this paper, we propose Unified Decoupled Feature Alignment (UDFA), a novel fine-tuning paradigm that achieves better performance than existing methods by fully exploring the combination of self-knowledge distillation and adversarial training for object detection.
Improving Sequential Recommendations via Bidirectional Temporal Data Augmentation with Pre-training
Our approach leverages bidirectional temporal augmentation and knowledge-enhanced fine-tuning to synthesize authentic pseudo-prior items that retain user preferences and capture deeper item semantic correlations, thus boosting the model's expressive power.