Data Authenticity, Consent, and Provenance for AI Are All Broken: What Will It Take to Fix Them?

The availability of massive, diverse training data collections has enabled new advances in AI. However, it has also raised concerns about transparency, authenticity, consent, privacy, representation, bias, copyright infringement, and the development of ethical and trustworthy AI systems. AI regulation increasingly stresses transparency in training data as a way to better understand the limitations of AI models. A lack of transparency in how AI training data is organized has hindered our ability to ensure data authenticity and consent, and to address harm and bias in AI models. Tackling the problems of responsible and accountable AI requires a unified framework for structured documentation of data properties. Building such a framework would require the participation of multiple stakeholders.

Source: MIT





