A virtual workshop hosted by Stanford HAI's Center for Research on Foundation Models, MIT, Princeton's Center for Information Technology Policy, and Humane Intelligence on October 28, 2024, examined the crucial role of third-party AI evaluations in assessing the risks of general-purpose AI systems.
While millions of people worldwide use AI systems like ChatGPT for writing, Claude for data analysis, and Stable Diffusion for image generation, these systems pose serious potential risks, including producing non-consensual intimate imagery, facilitating the production of bioweapons, and contributing to biased decisions.
In her keynote address, Rumman Chowdhury, CEO of Humane Intelligence, compared the current situation to a new Gilded Age, characterized by major economic disruption and a lack of protections for users and citizens. She noted that the standard practice in which "companies write their own tests and they grade themselves" can result in biased evaluations and limit standardization, information sharing, and generalizability beyond specific settings.
The workshop spanned three sessions exploring evaluations in practice, evaluations by design, and evaluation law and policy. Speakers highlighted that while software security has developed reporting infrastructure, legal protections, and incentives such as bug bounties, comparable frameworks do not yet exist for general-purpose AI systems.
The workshop brought together diverse perspectives to articulate a vision for third-party AI evaluations, emphasizing the need for legal protections, standardized evaluation practices, and clearer terminology to support evaluators.