Vignesh Padmanabhan (@ai-slayer)

A common finding across recent research on using reinforcement learning to train reasoning models is that the clipping operation within a trust region (core to PPO, adopted by GRPO) squashes rare tokens that are key to clever behaviors like verification or backtracking.
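To make the clipping claim concrete, here is a minimal sketch of the standard PPO clipped surrogate for a single token. The helper names and eps = 0.2 are illustrative assumptions, not taken from any particular paper or library. It shows one way to read "squashing": with a positive advantage, the objective stops rewarding a token once its probability ratio exceeds 1 + eps, so a rare token can gain far less absolute probability per update than a common one.

```python
def clipped_surrogate(ratio, advantage, eps=0.2):
    """PPO clipped objective for one token: min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    clipped_ratio = max(1.0 - eps, min(ratio, 1.0 + eps))
    return min(ratio * advantage, clipped_ratio * advantage)


def max_rewarded_prob(old_prob, eps=0.2):
    """Hypothetical helper: largest new probability that still earns extra
    objective when the advantage is positive (the ratio is capped at 1 + eps)."""
    return min(1.0, old_prob * (1.0 + eps))


if __name__ == "__main__":
    for old_prob in (0.01, 0.5):  # a rare token vs. a common token
        cap = max_rewarded_prob(old_prob)
        print(f"old_prob={old_prob:.2f}  cap={cap:.3f}  max gain={cap - old_prob:.3f}")
```

Under these assumptions the rare token (old probability 0.01) can gain only about 0.002 of probability mass before the objective goes flat, while the common token (0.5) can gain about 0.1.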

17.06.2025 02:38 👍 35 🔁 6 💬 2 📌 3

Tried creating an AI chatbot on Instagram

01.02.2025 19:52 👍 1 🔁 0 💬 0 📌 0

ML Maestro · AI by a.i.slayer Your AI engineering expert

aistudio.instagram.com/ai/636244858...

01.02.2025 19:52 👍 0 🔁 0 💬 1 📌 0

A high-level summary diagram taken from the slides linked below. It shows the interplay of two main components: a probabilistic model and decision maker or planner.

Probabilistic predictions of an underfitting polynomial classifier on a noisy XOR task and the corresponding under-confident calibration curve.

Probabilistic predictions of an overfitting polynomial classifier and the resulting overconfident calibration curve on the same noisy XOR problem.

Simulation study to show the relative lack of stability of hyperparameter tuning when using hard metrics such as Accuracy or soft yet not probabilistic metrics such as ROC AUC compared to a strictly proper scoring rule such as the log-loss.
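The following is not the simulation study from the slides, only a small sketch of one related property, using made-up toy labels and probabilities. Accuracy and ROC AUC depend only on thresholded or ranked predictions, so an overconfident monotone distortion of the probabilities leaves them unchanged, while the log-loss (a strictly proper scoring rule) penalizes it. The `sharpen` function is a hypothetical distortion chosen for illustration.

```python
import math


def accuracy(y, p, threshold=0.5):
    """Fraction of correct hard predictions at the given threshold."""
    return sum((pi >= threshold) == bool(yi) for yi, pi in zip(y, p)) / len(y)


def roc_auc(y, p):
    """Probability that a random positive is ranked above a random negative."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def log_loss(y, p):
    """Mean negative log-likelihood of the true labels."""
    return -sum(math.log(pi if yi == 1 else 1.0 - pi) for yi, pi in zip(y, p)) / len(y)


def sharpen(p, k=4.0):
    """Hypothetical overconfident distortion: pushes probabilities toward 0 or 1
    while preserving their order and which side of 0.5 they fall on."""
    return [pi**k / (pi**k + (1.0 - pi) ** k) for pi in p]


# Toy data, invented for illustration only.
y = [1, 1, 0, 1, 0, 0, 1, 0]
p = [0.8, 0.7, 0.3, 0.4, 0.2, 0.6, 0.9, 0.1]
q = sharpen(p)

if __name__ == "__main__":
    for name, probs in (("original", p), ("overconfident", q)):
        print(name, accuracy(y, probs), roc_auc(y, probs), round(log_loss(y, probs), 3))
```

On this toy data the two sets of predictions tie on accuracy and ROC AUC, and only the log-loss separates them.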

I recently shared some of my reflections on how to use probabilistic classifiers for optimal decision-making under uncertainty at @pydataparis.bsky.social 2024.

Here is the recording of the presentation:

www.youtube.com/watch?v=-gYn...

27.11.2024 14:17 👍 49 🔁 19 💬 1 📌 1

GitHub - probabl-ai/skore: Your scikit-learn Modeling Companion

Bringing structure and recommended practices to Machine Learning projects can be challenging. Even experienced data scientists struggle with it.

That's why we built skore – your companion when modeling with scikit-learn. Check it out and let us know what you think!

github.com/probabl-ai/s...

13.12.2024 09:30 👍 11 🔁 4 💬 0 📌 1

Vignesh Padmanabhan (@a.i.slayer) on Threads Which setup would you choose for running large language models locally ? LLMs Option 1: • Apple M4 Max • 14-core CPU, 32-core GPU • 36 GB unified memory • 1 TB SSD Option 2: • Apple M4 Pro • ...

Couldn’t get many opinions here, but to conclude this post: I got a lot of insights on Threads, for anyone looking.

www.threads.net/@a.i.slayer/...

26.11.2024 20:16 👍 1 🔁 0 💬 0 📌 0

Which setup would you choose for running large language models (LLMs) locally?

Option 1:
• Apple M4 Max
• 14-core CPU, 32-core GPU
• 36 GB unified memory
• 1 TB SSD

Option 2:
• Apple M4 Pro
• 14-core CPU, 20-core GPU
• 48 GB unified memory
• 1 TB SSD

25.11.2024 18:31 👍 1 🔁 0 💬 1 📌 0

What is Entropy? This short book is an elementary course on entropy, leading up to a calculation of the entropy of hydrogen gas at standard temperature and pressure. Topics covered include information, Shannon entropy...

Everything you always wanted to ask about entropy but didn't know whom to ask, by John Baez.

arxiv.org/abs/2409.09232

24.11.2024 06:48 👍 53 🔁 9 💬 2 📌 0

@bsky.app The translate option opens a separate tab to do the conversion. Is there an update that translates posts to English in place?

23.11.2024 09:39 👍 1 🔁 0 💬 0 📌 0

Love the Starter Pack by @bsky.app .. brilliant idea! Quickly finds all your X & Threads follows in one go!

23.11.2024 09:34 👍 0 🔁 0 💬 0 📌 0

We're always updating the pydata & scipy project starter pack:
go.bsky.app/6HkrMcp

Hello @scikit-learn.bsky.social , @networkx.bsky.social , @scipyconf.bsky.social

22.11.2024 17:46 👍 53 🔁 20 💬 6 📌 1

One of my fav projects: LeanRL, a simple RL library that provides recipes for fast RL training using torch.compile and cudagraphs.
Using these, we got >6x speed-ups compared to the original CleanRL implementations.
github.com/pytorch-labs...

22.11.2024 06:38 👍 33 🔁 5 💬 2 📌 1

A statistical approach to model evaluations A research paper from Anthropic on how to apply statistics to improve language model evaluations

www.anthropic.com/research/sta...

This is an excellent attempt (blog & paper) at bringing more statistical rigor to evaluation of ML models (this is specifically focused on LLM evals).

I feel like we need to have similar clear standards for many types of predictive models in biology. 1/

22.11.2024 08:29 👍 153 🔁 21 💬 4 📌 5

👋

22.11.2024 04:35 👍 0 🔁 0 💬 0 📌 0

Why the Original Transformer Figure Is Wrong, and Some Other Interesting Historical Tidbits About LLMs A few months ago, I shared the article, Understanding Large Language Models: A Cross-Section of the Most Relevant Literature To Get Up to Speed, and the positive feedback was very motivating! So, I also added a few papers here and there to keep the list fresh and relevant.

Just put together a list of papers to highlight 4 interesting things about transformers & LLMs.

Including a discussion on why the original transformer architecture figure is wrong, and a related approach published in 1991!

https://magazine.sebastianraschka.com/p/why-the-original-transformer-figure

25.05.2023 16:12 👍 30 🔁 5 💬 0 📌 0

The Llama 3.2 1B and 3B models are my favorite LLMs -- small but very capable.
If you want to understand what the architectures look like under the hood, I implemented them from scratch (one of the best ways to learn): github.com/rasbt/LLMs-f...

20.11.2024 08:33 👍 141 🔁 16 💬 7 📌 1

How do I get myself added here?

22.11.2024 04:23 👍 0 🔁 0 💬 0 📌 0
