Harvey has introduced BigLaw Bench, a benchmark for evaluating the performance of genAI tools on legal tasks. The company has published its methodology and scores, reporting 74% for answer quality and 68% for source reliability, outperforming general-purpose LLMs. The evaluation uses customised rubrics for different legal tasks. Harvey’s transparent approach and its commitment to developing BigLaw Bench further are commendable. London’s legal genAI community would do well to dig into BigLaw Bench and understand the multifaceted evaluations it provides.
Source: Artificial Lawyer