KAROKAN-ACT · AI Act Compliance Benchmark · Q4 2026

AI Act Compliance Benchmark

Name: KAROKAN-ACT: AI Act Compliance Benchmark
Creator: Karokan

A benchmark evaluating whether AI systems satisfy EU AI Act requirements: risk classification, documentation, transparency, and human oversight at model and system level.


Planned tasks: 800+

Regulatory articles

Compliance pillars: 4

Release: Q4 2026

Leaderboard coming Q4 2026

First results will be published alongside the initial release. Contact the research team to participate in the pilot evaluation.

Task categories

Weighted contribution to the overall score

Risk Classification: 25%

Evaluating whether systems accurately self-assess prohibited, high-risk, or limited-risk status (Art. 5–6)

Technical Documentation: 25%

Completeness and accuracy of required documentation artifacts (Art. 11, Annex IV)

Transparency & Explainability: 25%

User notifications, limitations disclosure, and AI-generated content marking (Art. 13, Art. 50)

Human Oversight: 25%

Human-in-the-loop mechanisms and meaningful override capabilities (Art. 14)
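
The equal weighting above can be illustrated with a minimal Python sketch. It assumes each category produces a score in [0, 1] and that the overall score is a plain weighted sum; the category keys and the `overall_score` function are hypothetical names chosen for illustration, not part of any published KAROKAN-ACT tooling.

```python
# Weights taken from the category list above (25% each).
# The dictionary keys are illustrative names, not official identifiers.
WEIGHTS = {
    "risk_classification": 0.25,
    "technical_documentation": 0.25,
    "transparency_explainability": 0.25,
    "human_oversight": 0.25,
}


def overall_score(category_scores: dict[str, float]) -> float:
    """Return the weighted sum of per-category scores.

    Assumption: every category score is normalized to the range [0, 1].
    """
    missing = WEIGHTS.keys() - category_scores.keys()
    if missing:
        raise ValueError(f"missing category scores: {sorted(missing)}")
    for name in WEIGHTS:
        score = category_scores[name]
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"{name} score {score} is outside [0, 1]")
    return sum(WEIGHTS[name] * category_scores[name] for name in WEIGHTS)


# Made-up scores, purely to show the arithmetic.
example = {
    "risk_classification": 1.0,
    "technical_documentation": 0.5,
    "transparency_explainability": 0.75,
    "human_oversight": 0.25,
}
print(overall_score(example))  # prints 0.625
```

With equal weights the overall score is simply the mean of the four category scores, so a weak result in one category pulls the total down as much as a weak result in any other.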

Related research

Published work this benchmark builds on — and the gap it addresses

Regulation (EU) 2024/1689: the Artificial Intelligence Act

Official Journal of the European Union, 2024

The binding text: in force since August 2024, GPAI obligations applicable from August 2025, most high-risk system obligations phasing in through 2026–2027.

COMPL-AI Framework: A Technical Interpretation and LLM Benchmarking Suite for the EU AI Act

Guldimann et al. (ETH Zürich, INSAIT, LatticeFlow), 2024 · arXiv:2410.07959

First open technical interpretation of the AI Act as model-level checks — found current frontier models underperform on robustness and fairness dimensions relative to capability.

General-Purpose AI Code of Practice

European Commission, AI Office, 2025

The compliance baseline GPAI providers sign up to — KAROKAN-ACT's documentation and transparency tasks map to its measures.

About

KAROKAN-ACT will provide a systematic evaluation framework for EU AI Act compliance, enabling labs and deployers to assess their systems against binding regulatory requirements. The benchmark covers both the technical AI system layer and the organizational governance layer, reflecting the dual obligations under the AI Act for providers and deployers of high-risk AI systems.

Methodology

Tasks are organized along four regulatory pillars defined by the AI Act: risk classification accuracy, technical documentation completeness, transparency and explainability, and human oversight mechanisms. Each task is authored against specific articles of the regulation and reviewed by legal experts specializing in EU technology law. The scoring rubric maps directly to compliance evidence criteria expected by national market surveillance authorities.
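
One way to picture the task organization described above is a small record type that ties each task to a pillar and to the provisions it was authored against. The sketch below is an assumption about how such metadata could be represented: the `Task` fields, the `Pillar` enum, and the `tasks_per_pillar` helper are hypothetical and do not describe the benchmark's actual schema.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Pillar(Enum):
    """The four pillars named in the methodology."""
    RISK_CLASSIFICATION = "risk classification"
    TECHNICAL_DOCUMENTATION = "technical documentation"
    TRANSPARENCY = "transparency and explainability"
    HUMAN_OVERSIGHT = "human oversight"


@dataclass(frozen=True)
class Task:
    """Hypothetical task record; field names are illustrative only."""
    task_id: str
    pillar: Pillar
    articles: tuple[str, ...]       # provisions the task is authored against
    legally_reviewed: bool = False  # set once a legal expert has signed off


def tasks_per_pillar(tasks: list[Task]) -> dict[Pillar, int]:
    """Count legally reviewed tasks per pillar, including pillars with none."""
    counts = Counter(t.pillar for t in tasks if t.legally_reviewed)
    return {pillar: counts.get(pillar, 0) for pillar in Pillar}


# Made-up task entries for illustration.
sample_tasks = [
    Task("rc-001", Pillar.RISK_CLASSIFICATION, ("Art. 5", "Art. 6"), True),
    Task("td-001", Pillar.TECHNICAL_DOCUMENTATION, ("Art. 11", "Annex IV"), True),
    Task("ho-001", Pillar.HUMAN_OVERSIGHT, ("Art. 14",), False),
]
coverage = tasks_per_pillar(sample_tasks)
```

Keeping the article references on each task makes it straightforward to check that every pillar has reviewed coverage before a release.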

Get involved

Review the benchmark design, submit a model for the pilot evaluation, or collaborate with our research team.


Other benchmarks

KAROKAN-EU · 2026

The European AI Productivity Index

Assesses whether frontier AI models can perform economically valuable professional tasks in European contexts — EU law, multi-country taxation, industrial standards, and cross-border regulatory analysis.

KAROKAN-LANG · 2026

European Multilingual Evaluation

A rigorous benchmark evaluating LLM quality beyond English, across all 24 official EU languages in professional and institutional contexts.