AI researchers uncover ethical, legal risks to using popular data sets

The advent of chatbots that can answer questions and mimic human speech has kicked off a race to build bigger and better generative AI models. It has also triggered questions around copyright and fair use of text taken off the internet, a key component of the massive corpus of data required to train large AI systems.

But without proper licensing, developers are in the dark about potential copyright restrictions, limitations on commercial use or requirements to credit a data set’s creators.

Major AI companies are facing a flurry of copyright lawsuits from book authors, artists and coders. Meanwhile, publishers and social media forums are threatening to withhold data amid closed-door negotiations.

Source: Washington Post
