Benchmarks that have been killed by LLM-based systems

Killed by LLM is a project that documents public AI benchmarks that LLM-based AI systems have largely solved since 2018. A benchmark is "killed" when it no longer measures the frontier of AI technology as a challenge asking "Can AI do X?", although it might still be a useful tool. The project links to papers documenting each fallen benchmark.

The project is hosted on GitHub, and others are invited to contribute additional benchmarks that have been overcome.

UMBC Center for AI

Tags:

ai
benchmark
llm

Posted: January 7, 2025, 9:18 AM

This page is archived and is kept only for reference, research, or record-keeping purposes. It is no longer updated.
