SKYLENAGE Benchmark Launches Multi-Level Math Evaluation for LLMs
SKYLENAGE introduces a math benchmark with 100 reasoning items and 150 contest problems; the top model reached 44% accuracy on the contest problems and 81% on the reasoning items. Read more: getnews.me/skylenage-benchmark-laun... #skylenage #mathbenchmark #llmevaluation