AI safety standards worldwide must keep up with the rapid development and deployment of AI technology. Our mission is to help accelerate the writing of AI safety standards.

Recent outputs & updates

  • NIST AI 800-2: Our Recommendations on Benchmark Lifecycle and Deprecation

    We submitted feedback to NIST on its draft AI 800-2 on Best Practices for Automated Benchmark Evaluations. We recommend that the draft treat benchmarks as active tools requiring lifecycle management, not static instruments. Our submission covers deprecation criteria, versioning, saturation, annotation quality, semantic drift, and the risks of relying on popular but flawed benchmarks. Our…

    Read more


  • Recommendations for the EU AI Act Digital Omnibus Trilogue

    We have published a report analysing the Council and European Parliament positions going into the EU AI Act Omnibus trilogue. We make several recommendations for the parties engaged in the trilogue. Our main concern in making these recommendations is that, while it is positive if the Omnibus can remove some administrative burdens, it should not…

    Read more


  • Defining AI Models and AI Systems: A Framework to Resolve the Boundary Problem

    AI regulation assigns distinct obligations to providers of AI models and AI systems, but the lack of clear, consistent definitions for “AI model” and “AI system” creates ambiguity across the value chain. This paper surveys the definitions used in academic literature and regulatory documents, and proposes conceptual and operational definitions for drawing a principled boundary…

    Read more


  • Recommendations on the European Parliament Amendments to the EU AI Act in the Digital Omnibus

    We have published a report analysing some of the 750+ AI Act amendments that were proposed by the European Parliament in the context of the EU AI Act Omnibus. We provide a first analysis of these amendments, highlighting specific ones that we either welcome or oppose, based on our area of expertise.

    Read more


  • A Scorecard for the Quality of AI Evaluations

    We have published a working draft of a Quality Scorecard for AI Evaluations, a standards-based framework for assessing the reliability, validity, and rigour of AI evaluations. The scorecard provides structured scoring across five dimensions and a classification system to match evaluations to appropriate governance and deployment contexts.

    Read more