MacroBench: New Benchmark for LLM‑Driven Web Automation Scripts
MacroBench, a new benchmark of 681 web‑automation tasks, reports that GPT‑4o‑Mini reached a 96.8% success rate, while every model scored 0% on complex multi‑step workflows. Read more: getnews.me/macrobench-new-benchmark... #macrobench #webautomation #llm