Clean and safe elected a mayor! That really should have been a bigger story.
This is where bringing in fixed costs will mess you up. If the unit economics on inference are good, then the answer is: more inference.
OK - this is what @davidcrespo.bsky.social and I are trying to tell you. That conflates cost of inference with overall profitability. If you don't separate those, you won't be able to think clearly about questions like, will more usage make the labs go broke?
Yeah, nice - seems like this is the main fallacy.
Thanks for taking the time to understand my argument. Stepping out of the analogy, "gas" is pretty much gonna be the incremental power used by adding a request to a GPU cluster's workload. Have you seen any argument that alone costs more than subscription revenue?
More like, let's use our capacity without overloading it, don't lose money serving the requests, and maximize growth.
Right, though I wouldn't say it as "discounted price." Rather, if it's your job to set the limits on a 'max' subscription product, it's not gonna just be $X/million tokens * Y average million tokens/subscriber * our markup.
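To make that concrete, here's a toy back-of-envelope sketch in Python. Every number is invented for illustration (none of it is real pricing or cost data); it just contrasts the naive "API price times average usage" framing with a marginal-cost framing.

```python
# Toy numbers, all invented for illustration: not real pricing or costs.
api_price_per_mtok = 15.00    # what the metered API charges per million tokens
avg_usage_mtok = 30           # average million tokens a 'max' subscriber uses per month
subscription_price = 200.00   # monthly subscription price

# Naive view: treat the subscriber's usage as forgone API revenue.
naive_cost = api_price_per_mtok * avg_usage_mtok
print(f"naive 'cost' of the subscriber: ${naive_cost:.2f} vs ${subscription_price:.2f} revenue")

# Marginal view: the compute is already paid for; the incremental cost of a
# request served on otherwise-idle capacity is mostly power plus some overhead.
marginal_cost_per_mtok = 0.50   # assumed incremental cost per million tokens
marginal_cost = marginal_cost_per_mtok * avg_usage_mtok
print(f"marginal cost of the subscriber: ${marginal_cost:.2f} vs ${subscription_price:.2f} revenue")

# Same subscriber, same usage: a 'loss' under the naive framing,
# a comfortably positive contribution under the marginal framing.
```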
well, you know. doubling revenue every couple months oughta do it www.techmeme.com/260303/p46#a...
Tbc what I meant was more that the incremental cost to serve some inference might be free or close to it, not that I think there's no way to reason about it.
They're not gonna be capital efficient right now. But losing world historic amounts of money is just not the same thing as having negative margins on inference or subscription plans.
With all respect, you're saying you have no clue, and you are not thinking in specific enough terms to get one. These are very weird orgs (tech co/research lab hybrids) and they are in hyper growth.
Because my belief is that the actual cost of inference is a) non-public and probably quite closely held, and b) almost certainly varies quite a bit and requires some advanced accounting to work out. You're just not going to use public info to say "aha - the subscription plan loses money, checkmate!"
I'm going to jump in here because I'm genuinely curious. Do you think that's true? If so what do you base your belief on?
Just checking back in on this...
In case this isn't tongue in cheek -- it's not that the problem is more significant, it's that that Knuth guy is sort of a big deal and one of the most respected computer scientists in the history of the field.
Excited to see these candidates though tbc!
I remember. I'll be honest, I didn't think she made a particularly strong case for herself. And that was a long time ago, Rob's progressive bona fides looked a lot better back then.
Nosse was also on the committee that wrote measure 110 repeal but afaik never took a stand or engaged with constituents during the process.
Deeply frustrated with how far he's drifted from what I thought his values were. No primary challenger I guess.
Nosse also does not yet have a primary challenger. Filing deadline is in one week (3/10) - would be great if that district had a rep worth a damn
They identify 3 sources of unreliability (participation rate, selection effects due to pay rate, concurrent agent use) and they think all of those bias towards undercounting.
Maybe they're right, maybe they're not, I don't know, but they're clear that the bias is in one direction.
Screenshot of text, with "unreliable signal" highlighted: Unfortunately, given participant feedback and surveys, we believe that the data from our new experiment gives us an unreliable signal of the current productivity effect of AI tools. The primary reason is that we have observed a significant increase in developers choosing not to participate in the study because they do not wish to work without AI, which likely biases downwards our estimate of AI-assisted speedup. We additionally believe there have been selection effects due to a lower pay rate (we reduced the pay from $150/hr to $50/hr), and that our measurements of time-spent on each task are unreliable for the fraction of developers who use multiple AI agents concurrently. Based on conversations with study participants, we believe it is likely that developers are more sped up from AI tools now, in early 2026, compared to our estimates from early 2025. However, because of the selection effects in our experiment, our data is only very weak evidence for the size of this increase. Our raw results show some evidence for speedup. Our early 2025 study found the use of AI causes tasks to take 19% longer, with a confidence interval between +2% and +39%. For the subset of the original developers who participated in the later study, we now estimate a speedup of -18% with a confidence interval between -38% and +9%. Among newly-recruited developers the estimated speedup is -4%, with a confidence interval between -15% and +9%.
Yeah so in this case the addendum is the rest of the paragraph you are getting "unreliable signal" from?
did you read the text
Not in this case though. The same org did a second round of study with the same participants in 2025 and found that they had become faster, despite some methodology issues they think bias towards undercounting. metr.org/blog/2026-02...
Massive w for Anthropic. Medium term all these companies live or die on recruiting. Whatever the cost turns out to be, you can't get a bigger recruiting boost than this.
This isn't AI specific. I first encountered it working with restaurants, who also have fixed costs and reserved capacity (a staffed kitchen.) And, they'll often run happy hours at a loss to warm up the kitchen, create a busy room, etc. If the marginal cost of the dish is covered it can make sense.
That creates very different pricing incentives than you'd expect if your mental model is an API that costs X dollars per token. Maybe you price the API so it's profitable, but sell subscription plans with generous limits at a loss (as long as the marginal cost is covered).
Buuuut now you've got lots of compute, whether or not you use it. Any time you're under capacity, the marginal cost to serve another million tokens might be zero, or close to it.
If you're an AI lab and you sign, let's say, a $30b deal for a certain amount of committed compute over the next few years, that's a big deal. You're burning a spectacular amount of money!
Almost nobody is thinking about this clearly. I haven't seen *any* discourse that understands the difference between total bottom-line cost and marginal cost, either. But my guess would be that's a very important dynamic for every big provider right now!
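A minimal sketch of that bottom-line vs marginal-cost distinction, with all figures made up for illustration: a big fixed compute commitment can produce a huge accounting loss even while every incremental token served is profitable.

```python
# All figures invented for illustration only.
annual_compute_commitment = 10e9   # fixed cost: owed whether or not the capacity gets used
tokens_served_mtok = 400e6         # million tokens served over the year
revenue_per_mtok = 8.00            # blended revenue (subscriptions + API) per million tokens
marginal_cost_per_mtok = 0.50      # incremental power/overhead per million tokens served

revenue = revenue_per_mtok * tokens_served_mtok
variable_cost = marginal_cost_per_mtok * tokens_served_mtok
contribution = revenue - variable_cost              # what serving inference adds, ignoring fixed costs
bottom_line = contribution - annual_compute_commitment

print(f"contribution from inference: ${contribution:,.0f}")        # positive (~$3B here)
print(f"bottom line after fixed commitment: ${bottom_line:,.0f}")  # hugely negative (~-$7B here)

# Both are true at once: the lab is burning a spectacular amount of money overall,
# while each marginal request is still profitable to serve.
```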