The LaMini Dataset is an instruction dataset generated using h2ogpt-gm-oasst1-en-2048-falcon-40b-v2. It is designed for instruction-tuning pre-trained models to specialize them in a variety of downstream tasks.
Each entry in the dataset contains: - Instruction - Response
The LaMini Dataset can be used to fine-tune language models to improve their ability to follow instructions and generate relevant responses.
The dataset is available on HuggingFace at the following link: https://huggingface.co/datasets/SurgeGlobal/LaMini
If you find our work useful, please cite our paper as follows:
@misc{surge2024openbezoar,
title={OpenBezoar: Small, Cost-Effective and Open Models Trained on Mixes of Instruction Data},
author={Chandeepa Dissanayake and Lahiru Lowe and Sachith Gunasekara and Yasiru Ratnayake},
year={2024},
eprint={2404.12195},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Chandeepa Dissanayake, Lahiru Lowe, Sachith Gunasekara, and Yasiru Ratnayake
Paper | Code | Results | Date | Stars |
---|