In conclusion, I think that the real score would definitely be much lower than a pass. Not to imply that an LLM (or whatever comes after) will not be able to do it in the future, but that moment is undoubtedly not today.
A2 is equally flawed. It gives the right solution and a proof for n=1, but for n>=2 it basically bails out with "it's much harder to achieve a neat identity". So again, if the test requires a proof for the solution, this is definitely not it.
For those who heard that o1 aced the Putnam test, well - no, it did not. For A1 it does demonstrate that n=1 has infinitely many solutions and that n=2 has no solution, but it gives no real demonstration for n>2. It would probably not score as a valid pass.
Monday. It's another Monday.
I am testing QwQ-32B (using this quantized version: huggingface.co/sbeltz/QwQ-3... ) on my lowly AMD device. Oh boy, does it perform. It is very, very chatty, takes a looong time to think, but it reaches the right answer at the end.
If you're curious about the state of #quantumcomputing, one of the resources I find myself recommending (and referencing) is Olivier Ezratty's "Understanding Quantum Technologies".
It's available for free on arXiv, or via Amazon for the hardcopy versions.
www.oezratty.net/wordpress/2...
Image of a paper titled "Can apparent superluminal neutrino speeds be explained as a quantum weak measurement?" The abstract says simply, "Probably not."
Love this abstract!
from arxiv.org/ftp/arxiv/pa... 🧪
An all-new Econ of AI starter pack (by @arpitrage.bsky.social): go.bsky.app/DfnDyqb
If there's one thing BlueSky excels at, it's bringing together people who are having fun doing what they love. I can't imagine a list like this (fun economists) on Twitter anymore.
As I'm finally settling in, here are two upcoming events I'm working on right now:
1️⃣ OSOR Handbook Consultation and workshop (12 Dec) interoperable-europe.ec.europa.eu/collection/o...
2️⃣ FOSDEM Devroom on EU Open Source Policy CFP - Deadline 1 December! softwarefreedom.net/fosdem-25-cfp
Ok, this is my first toe in the water here. And I will start with a question: any hints on whom to follow who is currently researching the economic impact of GenAI/AI as a horizontal tech? As a start, I looked at @erikbryn.bsky.social's profile here and followed most of the people he follows.