#gpqa
Posts tagged #gpqa on Bluesky
Curious how today’s top #LLMs stack up on real scientific reasoning? Here’s the current #GPQA leaderboard (graduate-level, Google-proof):
llm-stats.com/benchmarks/g...
#AI #LLM

VibeThinker‑1.5B just outpaced DeepSeek‑R1 at a reported training cost of $7.8K, matching bigger models on math and code tasks. Curious how it runs on edge devices? Dive into the details! #VibeThinker1_5B #DeepSeekR1 #GPQA

🔗 aidailypost.com/news/weibos-...

ChatGPT o3 Pro: OpenAI's new flagship or a marketing ploy? Let's take a look. OpenAI surprises again: the new ChatGPT o3 Pro model...

#chatgpt #o3 #pro #openai #benchmarks #aime #gpqa #codeforces #chatbot #arena #nyt

📊 #DeepSeek-R1 and R1-32B are making waves!

Crushing benchmarks across #AIME, #Codeforces, MATH-500 & more.

From #GPQA precision to SWE-bench prowess ... It’s clear: DeepSeek isn’t here to compete; it’s here to lead.

#AI #DeepSeek #Benchmarks #MachineLearning #OpenAI #ML

Plotting #GPQA scores against model release date gives a curve that certainly looks exponential. #e/acc

New #AI Model Shows Strong Mathematical Reasoning Capabilities 📊

#DeepSeek R1 Lite Preview matches #o1preview performance with 52.5% accuracy on #AIME2024, showing promising results in #Math and #GPQA benchmarks. Performance scales with increased thinking tokens. Try at chat.deepseek.com