#macrobench
Posts tagged #macrobench on Bluesky
MacroBench: New Benchmark for LLM‑Driven Web Automation Scripts

MacroBench, a new benchmark covering 681 web‑automation tasks, reports that GPT‑4o‑Mini reached a 96.8% success rate, while every model tested scored 0% on complex multi‑step workflows. Read more: getnews.me/macrobench-new-benchmark... #macrobench #webautomation #llm
