Embeddings are the bridge between the human world of language and the machine world of numbers, allowing AI models to process and generate textual data.
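To make the idea of turning language into numbers concrete, here is a minimal sketch using only the Python standard library. It is a toy illustration, not the method used for the Kelvin Legal DataPack: the `embed` and `cosine_similarity` helpers and the sample sentences are hypothetical, and simple word counts stand in for a trained embedding model.

```python
import math
from collections import Counter


def embed(text, vocabulary):
    """Map text to a vector of word counts over a fixed vocabulary (toy embedding)."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sample sentences: two legal, one about linguistics.
documents = [
    "the court granted the writ of habeas corpus",
    "the petitioner filed a writ of habeas corpus",
    "the linguist compiled a text corpus of news articles",
]

# The vocabulary is every distinct word seen in the samples, in sorted order.
vocabulary = sorted({word for doc in documents for word in doc.split()})
vectors = [embed(doc, vocabulary) for doc in documents]

legal_pair = cosine_similarity(vectors[0], vectors[1])
mixed_pair = cosine_similarity(vectors[0], vectors[2])

print(f"legal vs. legal:      {legal_pair:.3f}")
print(f"legal vs. linguistic: {mixed_pair:.3f}")
```

In this sketch the two legal sentences score as more similar to each other than to the linguistics sentence. Note that word counts treat every occurrence of "corpus" identically, regardless of context; trained embedding models aim to capture such differences in meaning.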
One use for the Kelvin Legal DataPack is to create a series of custom embedding models. To ensure that our embeddings are well-trained, we have created different embeddings for different problems in law.
Legal language is full of nuance and complexity.
“the legal domain has rare vocabulary terms such as ‘Habeas Corpus’, that are rarely found in domain-independent corpora. In addition, some common words imply a context-specific meaning in the legal domain. For example, the word ‘Corpus’ in ‘Habeas Corpus’ and ‘Text Corpus’ have different meanings in the legal domain.”
While models built on large bodies of information (including mixtures of general English and legal English) can perform very well on some tasks, certain legal problems remain challenging.
Source: Kelvin Legal