Google believes that human evaluation is superior to other methods of evaluating language models, and that academic evaluations have a significant impact on the training datasets of AI models. Bard with Gemini Pro performed well, and the new model is significantly better than its predecessor launched in March. The second-tier Gemini model, known as the Pro model, tied with GPT-4 in human evaluation and even outperformed two GPT-4 models in some aspects.
Source: The Decoder