yabs.io

Yet Another Bookmarks Service

[https://openreview.net/pdf?id=mdA5lVvNcU] - 2026-02-20 21:16:50 - public:mzimmerm

ai, benchmark, test - 3 | id:1538354 -

[https://www.mercor.com/apex/apex-agents-leaderboard/] - 2026-02-20 11:47:18 - public:mzimmerm

Mercor is the company that created the APEX benchmark for AI models. The benchmark focuses on finance.

[https://codeclash.ai/] - 2025-12-12 15:53:15 - public:mzimmerm

benchmark, code, good, vibe - 4 | id:1536664 -

Compares models at writing code, whereas other benchmarks mostly test git patches.

[https://www.swebench.com/] - 2025-12-12 11:40:44 - public:mzimmerm

benchmark, code, passmark, vibe, ai - 5 | id:1536663 -

Benchmark of AI coding models
