The article contributes to understanding LLM sustainability and efficiency. To address the high cost of running LLMs over large collections of queries and text, it proposes three strategies for reducing inference cost:
- prompt adaptation,
- LLM approximation, and
- LLM cascade.
FrugalGPT incorporates an LLM cascade that learns which combinations of LLMs to use for different queries in order to reduce cost and improve accuracy.
FrugalGPT has been shown to match the performance of the best individual LLM with up to a 98% cost reduction, or to improve accuracy over GPT-4 by 4% at the same cost.
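The cascade idea can be sketched in a few lines: try models cheapest-first, and let a learned scorer decide whether an answer is reliable enough to return or whether the query should escalate to a costlier model. The stage names, costs, and thresholds below are illustrative assumptions, not FrugalGPT's actual configuration, and the scorer is a stub standing in for the learned reliability model.

```python
# Sketch of an LLM cascade in the spirit of FrugalGPT.
# All names, costs, and thresholds are hypothetical placeholders.
from dataclasses import dataclass
from typing import Callable

@dataclass
class CascadeStage:
    name: str
    cost_per_query: float           # illustrative cost units
    generate: Callable[[str], str]  # calls the underlying LLM
    threshold: float                # minimum acceptable reliability score

def llm_cascade(query: str,
                stages: list[CascadeStage],
                score: Callable[[str, str], float]) -> tuple[str, float]:
    """Return (answer, total_cost); `score` estimates answer reliability in [0, 1]."""
    total_cost = 0.0
    answer = ""
    for stage in stages:
        answer = stage.generate(query)
        total_cost += stage.cost_per_query
        if score(query, answer) >= stage.threshold:
            return answer, total_cost  # cheap model was good enough; stop here
    return answer, total_cost          # fall through: keep the strongest model's answer

if __name__ == "__main__":
    # Toy demo with stub models and a stub scorer.
    stages = [
        CascadeStage("small-model", 0.1, lambda q: "maybe", threshold=0.9),
        CascadeStage("large-model", 1.0, lambda q: "42", threshold=0.0),
    ]
    stub_score = lambda q, a: 0.95 if a == "42" else 0.2
    answer, cost = llm_cascade("What is 6*7?", stages, stub_score)
    print(answer, cost)  # small model scores low, so the query escalates
```

The savings come from the fact that many queries never reach the expensive final stage: whenever the scorer accepts an early answer, the remaining (costlier) stages are skipped entirely.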
Source: Cornell University