Seth Kimmel (@sethkim.me)

Really great collaborating with @nathanlile.bsky.social!

Reach out if you're working on synthetic data generation, offline RL, or simulating agentic behavior.

16.07.2025 18:44 👍 1 🔁 0 💬 0 📌 0

Had a lot of fun with this one! Turns out that >1.5% of posts on Hacker News are aviation-related in recent years. www.skysight.inc/blog/hacker-...

03.04.2025 23:38 👍 2 🔁 0 💬 0 📌 0

Model Security with Large-Scale Inference _(Adapted from our talk at the Modal x Mistral Demo Night in San Francisco on March 6th, 2025)_

Gave a fun talk at the @modal-labs.bsky.social and Mistral AI demo night last week in SF! We discussed open-source model security, applying some of the large-scale inference techniques we've been working on recently.

www.skysight.inc/blog/model-s...

13.03.2025 19:51 👍 0 🔁 0 💬 0 📌 0

ALT: a man with a beard and a scarf around his head says lisan al gaib on the bottom

19.11.2024 23:51 👍 9 🔁 0 💬 0 📌 0

Do they have to nerd snipe us this bad?

19.11.2024 23:46 👍 6 🔁 0 💬 0 📌 0

show us the way

19.11.2024 23:42 👍 7 🔁 0 💬 1 📌 0

I've always wanted to like python notebooks

19.11.2024 23:39 👍 4 🔁 0 💬 1 📌 0

🥵 #dataBS

19.11.2024 02:01 👍 5 🔁 0 💬 0 📌 0

Thank you @jakthom.bsky.social!

18.11.2024 19:45 👍 1 🔁 0 💬 1 📌 0

Anyone want to make #graphBS a thing?

14.11.2024 20:11 👍 1 🔁 0 💬 0 📌 0

Depressing will be the day a developer asks me:

"What is StackOverflow?"

14.11.2024 19:28 👍 1 🔁 0 💬 1 📌 0

It seems pretty clear to me that this thing will be as big as or bigger than X in terms of sheer number of users

14.11.2024 06:19 👍 1 🔁 0 💬 0 📌 0

Do you think this is different than human confidence? I'd say 99.9% of what we think is true is second-hand knowledge

13.11.2024 19:03 👍 0 🔁 0 💬 1 📌 0

Using logprobs | OpenAI Cookbook

Somewhat disagree here. Have you ever looked at logprobs? The model far prefers steering in directions that it feels confident in given alternatives. cookbook.openai.com/examples/usi...

13.11.2024 16:59 👍 0 🔁 0 💬 1 📌 0
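
For anyone who hasn't looked at logprobs: a minimal sketch of how per-token log probabilities can be read as a rough confidence signal. The token names and values below are invented for illustration and are not real model output; `to_probs` and `confidence_margin` are hypothetical helper names, not part of any API.

```python
import math

# Hypothetical top-3 log probabilities for one generated token position.
# These numbers are made up for illustration, not taken from a real model.
top_logprobs = {"Paris": -0.05, "Lyon": -3.4, "Nice": -4.6}


def to_probs(logprobs):
    """Convert natural-log probabilities to linear probabilities."""
    return {token: math.exp(lp) for token, lp in logprobs.items()}


def confidence_margin(logprobs):
    """Gap between the most likely token and the runner-up."""
    ranked = sorted(to_probs(logprobs).values(), reverse=True)
    return ranked[0] - ranked[1]


probs = to_probs(top_logprobs)
margin = confidence_margin(top_logprobs)

for token, p in probs.items():
    print(f"{token}: {p:.3f}")
print(f"margin over runner-up: {margin:.3f}")
```

A large margin means the model strongly preferred one continuation over the alternatives; a small one means it was close to a coin flip at that position.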

So do humans! It's why we have QA/testing, and jobs that are just pure oversight.

You might expect both an LLM and a human to get a handful of data labeling tasks wrong, but have it checked with a verifier/adversarial LLM and you'll likely get ~100% accuracy.

13.11.2024 16:54 👍 0 🔁 0 💬 0 📌 0
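
A rough sketch of the label-then-verify pattern described above. The labeler and verifier here are toy stand-in functions so the example runs on its own; in practice each would wrap a model call, and all names (`label_with_verification`, `toy_labeler`, `toy_verifier`) are hypothetical. It shows the control flow only and says nothing about the accuracy you would actually get.

```python
def label_with_verification(items, labeler, verifier):
    """Label each item, then keep only labels the verifier agrees with.

    Items the verifier rejects are returned separately for another pass
    or for human review.
    """
    accepted, flagged = [], []
    for item in items:
        label = labeler(item)
        if verifier(item, label):
            accepted.append((item, label))
        else:
            flagged.append(item)
    return accepted, flagged


# Toy stand-ins for the two models (assumptions for illustration only).
def toy_labeler(text):
    # Naive rule: only the word "good" counts as positive.
    return "positive" if "good" in text else "negative"


def toy_verifier(text, label):
    # Rejects a "negative" label when the text contains "great".
    return not (label == "negative" and "great" in text)


accepted, flagged = label_with_verification(
    ["good food", "great view", "bad service"], toy_labeler, toy_verifier
)
print(accepted)
print(flagged)
```

The second checker catches the labeler's mistake on "great view" and routes it for review instead of letting the wrong label through.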

That being said, my guess is progress will resume when we start to generate high-quality, focused synthetic data. Sort of like forcing a human to go to the library and acquire actual knowledge instead of scrolling on social media all day

13.11.2024 03:48 👍 1 🔁 0 💬 0 📌 0

How can we be surprised that LLM scaling laws don't hold when the training data is literally just crap people write on the internet?

13.11.2024 03:46 👍 1 🔁 0 💬 0 📌 1

Btw the bait is me just saying there is no debate

13.11.2024 01:01 👍 1 🔁 0 💬 0 📌 0

Okay now that everyone is here who wants to get baited into an R vs. Python debate?

13.11.2024 01:01 👍 0 🔁 0 💬 0 📌 1

Seems like the holiday is bringing a huge number of new users over here. @jaz.bsky.social is your data supporting this?

11.11.2024 23:05 👍 0 🔁 0 💬 0 📌 0

Who is going to build the billion dollar slop/non-slop classifier?

11.11.2024 05:15 👍 2 🔁 0 💬 0 📌 0

While a lot of the content on X is clearly written by humans, it's sort of degraded into subhuman patterns of engagement. Lots of cryptic speak, baiting, trolling, inflaming, etc. Glad this place has actual humans posting actual human thoughts.

11.11.2024 05:05 👍 0 🔁 0 💬 0 📌 0

This is super cool! One has to assume that the most open, programmable, and hackable content platforms win in the long run

09.11.2024 20:44 👍 2 🔁 0 💬 0 📌 0

Becoming a more confident engineer isn't about writing less dumb code; it's about accepting the fact that everyone else's code is just as dumb as yours

08.11.2024 21:54 👍 0 🔁 1 💬 0 📌 0

GenAI - pay more for inefficient ML models
Crypto - pay more for others to verify your own transactions
SaaS - pay more for something a spreadsheet can do
Cloud - pay more for someone else’s computer
Mobile - pay more for apps that can run in a browser

08.11.2024 18:21 👍 1 🔁 0 💬 0 📌 0

I'm a big fan of the "anti" data warehouse approach! Users shouldn't be forced to store their data in a third-party system to get the benefits of its processing capabilities.

04.11.2024 22:38 👍 4 🔁 0 💬 0 📌 0

People forget that it's not unusual for Apple to release products that initially suck and are iteratively refined. They take big bets.

The original iPhone, Apple Maps, etc.

My guess is Apple Intelligence will have a dominant, frontier consumer AI product within ~5 years.

03.11.2024 23:00 👍 0 🔁 0 💬 0 📌 0

Startup idea: Cursor, but it just shits on your patterns and bullies you into refactoring your entire codebase every time you ask it a question.

03.11.2024 22:53 👍 5 🔁 0 💬 0 📌 0

4. The notion of consensus will be a lot more important, and agentic moderators might be in charge of modifying embedding indexes to more accurately represent reality and remove hallucinations/biases

31.10.2024 21:01 👍 0 🔁 0 💬 0 📌 0

3. APIs and internet-based services might not be as rigid. An LLM can more freely negotiate with a service provider if its request doesn't conform to a certain standard.

31.10.2024 20:56 👍 0 🔁 0 💬 0 📌 0
