Tag: Large Language Models
Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes
This article describes a mechanism that leverages large language model (LLM) rationales to train small models within a multi-task framework, using less training data than traditional methods such as fine-tuning or distillation. The authors show that models substantially smaller than LLMs can match or exceed their performance, reducing both the model size and the training data required […]
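
To make the multi-task idea concrete, here is a minimal sketch of that kind of objective: the small student is trained on two targets per input, the task label and the LLM-generated rationale, and the two losses are summed. It assumes a Hugging Face-style seq2seq student whose forward pass returns an object with a `.loss` field; the names `student`, `label_batch`, `rationale_batch`, and the weight `lam` are illustrative, not taken from the article.

```python
def multitask_loss(student, label_batch, rationale_batch, lam=1.0):
    """Weighted sum of the label-prediction and rationale-generation losses.

    `label_batch` and `rationale_batch` each hold `input_ids`,
    `attention_mask`, and `labels` tensors, where `labels` is the task
    label in one case and the LLM's rationale in the other.
    """
    # Task 1: predict the task label from the input.
    label_loss = student(**label_batch).loss
    # Task 2: reproduce the LLM's rationale for the same input
    # (the two tasks are typically distinguished by a prefix in the input text).
    rationale_loss = student(**rationale_batch).loss
    return label_loss + lam * rationale_loss
```

The rationale task acts as auxiliary supervision during training only; at inference time the student predicts labels directly, so no rationale generation cost is paid.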