Huh. Looks like Plato was right.
A new paper shows all language models converge on the same "universal geometry" of meaning. Researchers can translate between ANY model's embeddings without seeing the original text.
Implications for philosophy and vector databases alike. arxiv.org/pdf/2505.12540
23.05.2025 02:44
AI Agents vs. Agentic #AI: A Conceptual Taxonomy, Applications and Challenges (preprint) arxiv.org/abs/2505.10468
17.05.2025 13:55
The White House has begun process of looking for new secretary of defense
BREAKING NEWS: The White House has begun the process of looking for a new secretary of defense, according to a U.S. official who was not authorized to speak publicly.
21.04.2025 17:25
They show LMs can synthesize their own thoughts for more data-efficient pretraining, bootstrapping their capabilities on limited, task-agnostic data. They call this new paradigm "reasoning to learn".
27.03.2025 03:54
PapersChat - Chat with Research Papers
PapersChat provides an agentic AI interface for querying papers, retrieving insights from ArXiv & PubMed, and structuring responses efficiently.
github.com/AstraBert/Pa...
10.03.2025 04:47
French Senator Claude Malhuret:
"Washington has become Nero's court, with an incendiary emperor, submissive courtiers and a jester high on ketamine... We were at war with a dictator, we are now at war with a dictator backed by a traitor."
05.03.2025 15:47
A few words on DeepSeek's new releases. Links are:
- github.com/deepseek-ai/...
- github.com/deepseek-ai/...
- github.com/deepseek-ai/...
and the Ultra-Scale Playbook at huggingface.co/spaces/nanot...
27.02.2025 13:41
Just read the s1: Simple Test-Time Scaling paper. Super interesting approach to improving reasoning models!
TL;DR:
1. SFT on 1k curated examples w/ reasoning traces.
2. Control response length w/ budget forcing:
"Wait" tokens → longer reasoning/self-correction.
"Final Answer:" → enforce stopping.
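A minimal sketch of the budget-forcing idea from step 2, under loud assumptions: `END_THINK`, `toy_model`, and `budget_forced_reasoning` are hypothetical names made up for illustration, and the scripted toy model stands in for a real LM. It is not the paper's implementation, only the control loop the TL;DR describes.

```python
from typing import Callable, List

END_THINK = "<end_think>"  # hypothetical end-of-reasoning marker, not from the paper


def budget_forced_reasoning(
    next_token: Callable[[List[str]], str],
    prompt: List[str],
    min_tokens: int,
    max_tokens: int,
) -> List[str]:
    """Run a decoding loop that enforces a minimum and maximum reasoning length."""
    trace: List[str] = []
    while len(trace) < max_tokens:
        token = next_token(prompt + trace)
        if token == END_THINK:
            if len(trace) >= min_tokens:
                break  # the model may stop: the minimum budget is spent
            # Too early to stop: suppress the stop marker and nudge the model on.
            trace.append("Wait")
            continue
        trace.append(token)
    # Whether the model stopped or the cap was hit, force the answer phase.
    return trace + ["Final Answer:"]


def toy_model(context: List[str]) -> str:
    """Scripted stand-in for an LM: tries to stop after two consecutive steps."""
    return END_THINK if context[-2:] == ["step", "step"] else "step"
```

Calling `budget_forced_reasoning(toy_model, ["Q"], min_tokens=5, max_tokens=8)` makes the toy model keep going past its first attempt to stop, while `max_tokens` caps the trace no matter what the model wants.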
07.02.2025 13:59
Maybe a hot take, but what about the following advice to the next gen:
Don't get an AI degree; the curriculum will be outdated before you graduate. Instead, study math, stats, or physics as your foundation, and stay current with AI through code-focused books, blogs, and papers.
09.02.2025 15:36
A herd of bison stretching off into the distance on a snowy prairie.
Bison should be allowed to roam free and cattle should be restricted to private land.
All abandoned barbed wire should be removed from public land.
The money today being wasted on public lands grazing should go into building wildlife overpasses and installing wildlife safe guide fencing.
07.02.2025 16:46
no pun intended but "Attention is all you need"
29.01.2025 17:52
Not one VC would ever fund a startup to do the kind of hardcore optimization work that DeepSeek did.
Every VC firm should be asking themselves why.
28.01.2025 05:00
Haven't we been doing the same to Google and Facebook for the past 15 years?
27.01.2025 03:02
Mastering Tensor Dimensions in Transformers
A Blog post by Hafedh Hichri on Hugging Face
This is a wonderfully simple blog on how tensors flow through a transformer model.
Covering:
- Tokenize
- Embed
- Positional Encoding
- Decoder
- Multi-Head Attention
- Add and normalize
- Feed-Forward
- Model Head
- Cross-Attention
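For a rough feel of the stages listed above, here is a sketch that only does shape bookkeeping for a standard encoder-decoder transformer. The function name, stage labels, and example sizes are assumptions for illustration and are not taken from the blog post.

```python
from typing import List, Tuple

Shape = Tuple[int, ...]


def transformer_shapes(
    batch: int, src_len: int, tgt_len: int,
    d_model: int, n_heads: int, d_ff: int, vocab: int,
) -> List[Tuple[str, Shape]]:
    """Return (stage, tensor shape) pairs for one decoder pass."""
    if d_model % n_heads != 0:
        raise ValueError("d_model must be divisible by n_heads")
    d_head = d_model // n_heads  # each head works on a slice of the model width
    hidden = (batch, tgt_len, d_model)
    return [
        ("tokenize: token ids", (batch, tgt_len)),
        ("embed", hidden),
        ("positional encoding", hidden),  # added element-wise, shape unchanged
        ("multi-head attention: per-head Q/K/V", (batch, n_heads, tgt_len, d_head)),
        ("multi-head attention: scores", (batch, n_heads, tgt_len, tgt_len)),
        ("multi-head attention: merged output", hidden),
        ("add and normalize", hidden),  # residual sum plus layer norm
        ("cross-attention: scores", (batch, n_heads, tgt_len, src_len)),
        ("cross-attention: output", hidden),
        ("feed-forward: hidden layer", (batch, tgt_len, d_ff)),
        ("feed-forward: output", hidden),
        ("model head: logits", (batch, tgt_len, vocab)),
    ]


if __name__ == "__main__":
    for stage, shape in transformer_shapes(2, 7, 5, 512, 8, 2048, 32000):
        print(f"{stage:40s} {shape}")
```

The one place the source sequence length shows up is the cross-attention score matrix, where decoder positions attend over encoder positions.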
Blog:
14.01.2025 13:00
Free Our Feeds! What is it! @freeourfeeds.com
F.O.F. is an independent group with the goal of running THIS social network totally outside of Bluesky.
It's not us. It's a fully independent version of the network. All the same users and posts. Running cooperatively with us and others.
13.01.2025 21:02
If you're an AI startup, or interviewing w/ one, ask:
What are you the best in the world at?
Do you offer a service, formula, or delivery method you invented?
Is there something you do that's patentable or a unique user experience?
Have you identified and isolated a market segment?
If not, walk
05.01.2025 22:33
Happy new year 2025
01.01.2025 18:53
Very interesting paper by Ananda Theertha Suresh et al.
For categorical/Gaussian distributions, they derive the rate at which a sample is forgotten to be 1/k after k rounds of recursive training (hence model collapse happens more slowly than intuitively expected)
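A toy way to poke at this, assuming the simplest possible stand-in for recursive training: fit a categorical distribution to the current dataset by maximum likelihood and sample the next dataset from it, which amounts to resampling with replacement. The function name and setup are illustrative assumptions and not the paper's experiment.

```python
import random
from typing import List


def surviving_fraction(n_samples: int, rounds: int, seed: int = 0) -> List[float]:
    """Fraction of the original samples still represented after each round."""
    rng = random.Random(seed)
    # Each entry is the index of the original sample it descends from.
    generation = list(range(n_samples))
    fractions = []
    for _ in range(rounds):
        # One round of "train, then generate": draw a same-sized dataset from
        # the empirical distribution of the previous generation.
        generation = rng.choices(generation, k=n_samples)
        fractions.append(len(set(generation)) / n_samples)
    return fractions


if __name__ == "__main__":
    for k, frac in enumerate(surviving_fraction(2000, 20), start=1):
        print(f"round {k:2d}: surviving fraction {frac:.3f}, times k = {frac * k:.2f}")
```

If forgetting in this toy setting follows a 1/k-style rate, the `times k` column should level off instead of crashing toward zero, which is what exponential forgetting would produce.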
27.12.2024 23:35
lol wait until they realize Vivek is Indian as well
26.12.2024 17:04
Aranym/40-million-bluesky-posts · Datasets at Hugging Face
Releasing a dataset of 40 million Bluesky posts!
Collected using the Firehose API, I hope people do some cool ML with it.
Anonymized with a data removal mechanism and includes text, language predictions, and image data.
#ai #ml #NLP
huggingface.co/datasets/Ara...
17.12.2024 15:25
only bay area residents have an exclusive right to refer to Sf as the city
24.12.2024 20:13
Eugene Vinitsky
A short list of tips for keeping a clean, organized ML codebase for new researchers: eugenevinitsky.com/posts/quick-...
18.12.2024 20:00
LLM Research Papers: The 2024 List
A curated list of interesting LLM-related research papers from 2024, shared for those looking for something to read over the holidays.
Hey all, I've been a bit quiet the last couple of weeks as I am recovering from an accident & injury.
Unfortunately, I couldn't write my yearly AI research review this year, but here's at least a list of bookmarked papers you might find useful: magazine.sebastianraschka.com/p/llm-resear...
22.12.2024 14:02
Title card: Alignment Faking in Large Language Models by Greenblatt et al.
New work from my team at Anthropic in collaboration with Redwood Research. I think this is plausibly the most important AGI safety result of the year. Cross-posting the thread below:
18.12.2024 17:46
it really depends on the type of spice: cumin or coriander pre-sauté; if it's garam masala, post-sauté to preserve the aroma
10.12.2024 01:23
LLMs might secretly be world models of the internet!
By treating LLMs as simulators that can predict "what would happen if I click this?" the authors built an AI that can navigate websites by imagining outcomes before taking action, performing 33% better than baseline. arxiv.org/pdf/2411.06559
03.12.2024 02:00