No need for a substack when the memes are good
I would say a confusing score of 7.5 :D
I used to feel the same, but I experienced JFK for the first time this year, so my opinion has changed.
It's amazing how many times one must say: increased efficiency == increased usage
Source: I have eyes and have lived as an immigrant now for over a decade in 4 different countries (and am currently in process for a second citizenship).
There is free movement, but only if you're rich.
The system in place lets the owning class move freely while the workers are bound by national borders. It ensures that they can keep what they own.
I think you know what they mean. _National_ borders are arbitrary and created by humans. Pedantry isn't really useful.
All borders are arbitrary and created by humans
Interesting. I tried again with no luck. Tried some basic prompt injection, also with no luck. Then tried to recreate the conversation history I'd had before, and voila! The answer it gives is at the bottom. I just copypastaed the relevant bits.
gist.github.com/ryancallihan...
I used DeepSeek-R1-Distill-Qwen-32B, distilled from qwen2 and llama. I should have screen capped. I'll try it again later!
This is exactly what I've been saying for the past couple weeks. Yes, the not-see salute is bad, but hot damn has anyone seen the stuff that will really make an impact?
Not sure about the app, but when running the model locally, it happily told me all about Tiananmen Square :D.
Read a really nice paper on this last year: arxiv.org/abs/2410.18417
It's almost 2025. It's pretty normal now
Bill Murray won't age well in general.
It links directly to the substack. No need to be passive aggressive.
This resonated with me in a big way. Had a long conversation yesterday with my partner about just this. Do we struggle against the collapse, simply prepare for the new reality, or indulge in a sort of leftist hedonism? It's a weird thing to grapple with.
It's a rough job market out there. It took me a year to get an offer for a senior role. I was just looking for a change, so it wasn't urgent.
I absolutely do not envy juniors. It's really up to seniors to push for mentorship and to take a chance on them.
Side note: It would have been nice to see precision reported in this study so as to best understand the quality of reranking.
arxiv.org/abs/2411.11767
Practically, this means that we either need to really make sure that our initial retrieval is as good as it can be or that the number of documents we retrieve needs to be controlled to make the best use of rerankers.
A very common workflow is to fetch K documents and then rerank them as a post-processing step. What this test finds is that the larger K is, the more the returns diminish.
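For anyone who wants the fetch-K-then-rerank workflow spelled out, here is a minimal sketch. It is not the paper's setup: the token-overlap retriever and `toy_scorer` are hypothetical stand-ins for a real first-stage retriever and a real reranker model, and `k` and `top_n` are just the two knobs being discussed.

```python
from typing import Callable, List


def retrieve(query: str, docs: List[str], k: int) -> List[str]:
    """First stage: rank by a cheap token-overlap count and keep the top k."""
    q_tokens = set(query.lower().split())
    ranked = sorted(
        docs,
        key=lambda d: len(q_tokens & set(d.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def toy_scorer(query: str, doc: str) -> float:
    """Hypothetical reranker score: share of doc tokens that appear in the query."""
    q_tokens = set(query.lower().split())
    tokens = doc.lower().split()
    return sum(t in q_tokens for t in tokens) / max(len(tokens), 1)


def rerank(
    query: str,
    candidates: List[str],
    scorer: Callable[[str, str], float],
    top_n: int,
) -> List[str]:
    """Second stage: rescore only the k candidates and keep the best top_n."""
    ranked = sorted(candidates, key=lambda d: scorer(query, d), reverse=True)
    return ranked[:top_n]


if __name__ == "__main__":
    corpus = [
        "rerankers improve retrieval quality",
        "beer in germany",
        "retrieval with rerankers and large k can hurt quality a lot",
    ]
    question = "rerankers retrieval quality"
    # k controls how many documents the reranker ever gets to see.
    print(rerank(question, retrieve(question, corpus, k=2), toy_scorer, top_n=1))
```

The reranker only ever sees what the first stage hands it, so `k` is the thing to tune or cap.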
Drowning in Documents: Consequences of Scaling Reranker Inference
This paper conducts a simple test of the effectiveness of rerankers on large amounts of documents. It's really important to think about if you are using RAG a lot.
Is your issue with multi-agent systems:
* Complexity
* Ineffectiveness
* Scale/cost
* Something else?
It is, without a doubt, the best beer city in Germany.
I hope I am not late to the party (was away post-quals chilling) but here are some thoughts on why this is bad IMO:
First, a disclaimer that I am writing this as an African who is a speaker of multiple African languages, NLP researcher of African languages, and HCI researcher focusing broadly on..
Love this. Not to mention that whatever is SOTA for English and languages sharing similar properties to English is not necessarily the best way to work with other languages and language families.
Anyone saying The Left must stay on Twitter to save democracy doesn't understand how Twitter affects our psychology. Twitter makes money by disconnecting us from social reality and making us feel shitty about ourselves and each other.
Is it Bad to leave Twitter? No. Here are 7+ years of insights from my lab's research that explain why.
Featuring work w/ @williambrady.bsky.social @killianmcloughlin.bsky.social
🧵
It reduces noise, reduces and measures chance, and doesn't treat eval tasks as a whole but separates them so that they can be better measured. If this trend takes off, I will definitely reverse my grumpiness around evaluation.
arxiv.org/abs/2411.00640
This paper from Anthropic very sensibly suggests that ML papers use very basic and standard statistical measures of impact, variance and difference when evaluating models and strategies.
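To make "basic and standard statistical measures" concrete, here is a rough sketch of what reporting them could look like. It is not the paper's exact recipe: the function names are made up for illustration, the 1.96 multiplier assumes a normal approximation for a 95% interval, and the per-question scores are assumed to be independent.

```python
import math
from statistics import mean, stdev
from typing import List, Tuple


def mean_with_ci(
    scores: List[float], z: float = 1.96
) -> Tuple[float, float, Tuple[float, float]]:
    """Return the mean, its standard error, and an approximate 95% interval."""
    m = mean(scores)
    # Standard error of the mean: sample standard deviation over sqrt(n).
    sem = stdev(scores) / math.sqrt(len(scores))
    return m, sem, (m - z * sem, m + z * sem)


def paired_difference(
    a: List[float], b: List[float], z: float = 1.96
) -> Tuple[float, float, Tuple[float, float]]:
    """Compare two models on the same questions via per-question differences."""
    if len(a) != len(b):
        raise ValueError("both models must be scored on the same questions")
    diffs = [x - y for x, y in zip(a, b)]
    return mean_with_ci(diffs, z)


if __name__ == "__main__":
    # Hypothetical per-question correctness (1 = right, 0 = wrong).
    model_a = [1, 0, 1, 1, 0, 1, 1, 0]
    model_b = [1, 0, 0, 1, 0, 0, 1, 0]
    print(mean_with_ci(model_a))
    print(paired_difference(model_a, model_b))
```

If the interval on the paired difference straddles zero, the metric table is not showing a real gap between the two models.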
There's nothing more uninteresting to me than a new or fine-tuned model and its generic table of metric comparisons to other open and closed source models. When it comes down to it, most eval metrics don't really tell you a lot and a lot of it is left to chance.