
At least they moved to year-based versioning. That was probably the best part of the 26/Tahoe release.

It’s called decoding and encoding.
But the big data centers doing all the video processing for the big video services (both permanent videos from a library and things like live streaming) are encoding the videos with settings that require less computational power to decode. The idea is to let even old budget smartphones display the video with very low power requirements on the client device. There’s no universe where consumers decoding digital video will be a high-power computational task.
Restaurants have sharp knives in the kitchen, but generally serve food that requires only minimal cutting effort from the table knives set out with the rest of the table settings. Dining will always be easier than cooking, by a margin that makes the difficulty of dining not worth mentioning, so it would be bizarre to criticize a knife as being only good for cooking and eating food, when plenty of dining tableware knives out there would be insufficient for kitchen work.
You’ve made the mistake of lumping decoding and encoding together based on the algorithmic/mathematical similarity of those tasks, when everyone else is more inclined to discuss the very different end user use cases of those computing needs.

I don’t think government funding can actually offset the crash that comes from consumer and business demand being insufficient to cover the cost of the most expensive models on the most expensive GPUs. But if you look through my comment history I’ve made the comparison to supersonic flight, because I genuinely believe there’s a possibility that governments fund the expensive branch of this technology for their own military or surveillance or law enforcement purposes without the benefits necessarily spilling out into normal commercial applications.
We’ve hit the point where training a model (both pre-training and post-training) isn’t the expensive part, and the expensive part is actual inference, which makes it hard to scale the most expensive models to where they’re useful for a lot of people. So it might be that the companies and governments that can afford to operate an expensive model are the only ones to do it. And they’ll be able to, without the public necessarily having access to the same tech.

There are plenty of examples of companies spending more than they earn for decades. Before OpenAI and Anthropic, though, nobody had ever needed to raise more than $100 billion from investors before turning a profit. The scale is immense, enough that it affects the liquidity of the investors that have funded their rise.

The business model should be that with economies of scale they could provide compute much cheaper than the average consumer can buy to run locally.
That business model assumes that the huge cloud models will always maintain a gap worth paying for, compared to the local models. I’m just not convinced that the average consumer will need cloud models for summarizing their emails or the news of the day.
And as for the actual costs of their data centers, there literally aren’t enough humans in the world for $20/month of AI spending per person to let them break even. They’ll need to sell big accounts (many businesses spending billions per year) in order to break even.

There’s just no way to pay for the cost of these services, though.
When someone constructs a 100 MW data center (now considered a smaller one for new construction), that’s about $2 billion in total costs to outfit the whole operation. And then once it’s on, we’re talking something like $10-20 million/month in electricity alone, and a few million in other costs. How many $20 subscriptions do you need to sell just to break even on your operating expenses? How many $100/month subscriptions do you need to sell to make a dent in your interest payments on the construction? Will there be a market for $1000/month subscriptions from millions of customers? If not, how’s this all going to be paid for?
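
The break-even questions above can be roughed out in a few lines. This is a back-of-the-envelope sketch, not real financials: the electricity range and the $2 billion build cost come from the comment, while the $3 million for "other costs" and the 6% annual interest rate are assumptions picked only for illustration.

```python
import math

def subscriptions_to_cover(monthly_cost_usd: float, price_per_month_usd: float) -> int:
    """Smallest whole number of subscriptions whose revenue covers the monthly cost."""
    return math.ceil(monthly_cost_usd / price_per_month_usd)

# Figures taken from the comment.
ELECTRICITY_LOW_USD = 10_000_000    # per month
ELECTRICITY_HIGH_USD = 20_000_000   # per month
CONSTRUCTION_COST_USD = 2_000_000_000

# Assumptions for illustration only (not from the comment).
OTHER_OPEX_USD = 3_000_000          # "a few million" per month, assumed to be 3
ANNUAL_INTEREST_PERCENT = 6         # hypothetical financing rate

# $20 subscriptions needed to cover operating expenses alone.
opex_subs_low = subscriptions_to_cover(ELECTRICITY_LOW_USD + OTHER_OPEX_USD, 20)
opex_subs_high = subscriptions_to_cover(ELECTRICITY_HIGH_USD + OTHER_OPEX_USD, 20)

# $100 subscriptions needed to cover interest only, with no principal repaid.
monthly_interest_usd = CONSTRUCTION_COST_USD * ANNUAL_INTEREST_PERCENT / 100 / 12
interest_subs = subscriptions_to_cover(monthly_interest_usd, 100)

print(opex_subs_low, opex_subs_high, interest_subs)
```

Under those assumptions, operating expenses alone take roughly 650,000 to 1.15 million $20 subscriptions for one 100 MW site, and interest alone takes about 100,000 $100 subscriptions, before any principal is repaid or any hardware is replaced.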

Once you get into things with useful generation and large context windows, or things like video generation, suddenly you need one or more $10,000+ pieces of hardware to run it.
A Blackwell server with 72 GPUs costs about $3 million, and requires 130 kW of power (about the max rated power of 3 residential homes, each fed through a residential 200A circuit box, for about $600-$1000/day in electricity cost).
You’re gonna need to sell a lot of $20/month subscriptions to get that paid for, assuming that the server is good for 5 years. If it’s only good for 3 years, the economics are basically impossible.
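
A rough sketch of that arithmetic, assuming straight-line amortization of the $3 million purchase price and using only the electricity range quoted above. Everything else (cooling, networking, staff, financing) is ignored, so treat the result as a floor, not an estimate.

```python
import math

# Figures taken from the comment.
SERVER_COST_USD = 3_000_000
ELECTRICITY_PER_DAY_USD = (600, 1000)   # low and high end of the quoted range
SUBSCRIPTION_PER_YEAR_USD = 20 * 12

def subscribers_needed(lifetime_years: int, electricity_per_day_usd: int) -> int:
    """Subscribers needed to cover hardware amortization plus electricity for one rack."""
    yearly_hardware = SERVER_COST_USD / lifetime_years   # straight-line, an assumption
    yearly_power = electricity_per_day_usd * 365
    return math.ceil((yearly_hardware + yearly_power) / SUBSCRIPTION_PER_YEAR_USD)

five_year = tuple(subscribers_needed(5, e) for e in ELECTRICITY_PER_DAY_USD)
three_year = tuple(subscribers_needed(3, e) for e in ELECTRICITY_PER_DAY_USD)

print(five_year, three_year)
```

With these inputs, one rack needs roughly 3,400 to 4,000 subscribers at $20/month over a 5-year life, and roughly 5,100 to 5,700 over a 3-year life. Whether that is achievable depends on how many paying users a single rack can actually serve, which this sketch does not model.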

But you’re seeing a screenshot of an unmatched order that no driver has claimed yet. I’m saying that unless an actual match is accepted, that’s not really evidence that people in a place don’t tip well, just that some people don’t get their orders filled.
If you never give less than $5, then any order you’re involved in will involve at least a $5 tip. That may not be representative of the orders you’re not involved with.

I think the user decides how much to tip in advance, and the app conveys that information to potential matches. Orders with low tips tend to sit there unclaimed, because no driver wants to bother with that.
I’m not sure if Uber does it that way, but Doordash does.

I remember reading about a case a few years ago where a warehouse couldn’t figure out which of its workers was just periodically taking shits in random corners of the warehouse. I think I’m starting to understand a different angle to that story, though.

It’s gonna be so fucking funny when the push to sell silicon that can run local models at 100 watts or less ends up destroying the business models of the companies that built out 100,000,000,000 watts of data centers.

Yeah, but that’s always been true of paid software licenses for a particular version: it reaches EOL and you have to decide whether to live with the possibility of unpatched known vulnerabilities or pay for an upgrade to a more recent release.
MS Office has been doing this from back in the Windows 3.0 days at least.

The only solution is to make sure they can’t read data you don’t want shared.
Isn’t that the appropriate guardrail, then? LLM chats and agents and whatever need to be contained with external permissions settings that the LLMs simply do not and can never have the power to override.
In a normal customer service setting with human agents, there are still plenty of examples of things a human agent simply doesn’t have the power to do. Often, they’ll need to escalate to a manager to do things like process refunds, not just because they weren’t given social permission to do so, but because they weren’t given technical permissions to do so. LLM agents need to be contained in the same way. Any decent use of agents, human or software, requires carefully designed processes and permissions extrinsic to that agent’s own decision-making abilities to make sure that agents don’t do something bad for the company.
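
A minimal sketch of what a permission check extrinsic to the agent could look like. All names here (`Role`, `execute`, `issue_refund`, `look_up_order`) are hypothetical and not from any real framework; the point is only that the check lives in the code that executes actions, so nothing the agent says or decides can change it.

```python
from dataclasses import dataclass

class PermissionDenied(Exception):
    """Raised when a caller requests an action outside its role."""

@dataclass(frozen=True)
class Role:
    name: str
    allowed_actions: frozenset

# Hypothetical backend actions.
def look_up_order(order_id: str) -> str:
    return f"order {order_id}: shipped"

def issue_refund(order_id: str, amount: float) -> str:
    return f"refunded {amount:.2f} on {order_id}"

ACTIONS = {"look_up_order": look_up_order, "issue_refund": issue_refund}

def execute(role: Role, action: str, **kwargs) -> str:
    # The gate sits outside the agent: it looks only at the role it was given,
    # never at the agent's reasoning or the text of the request.
    if action not in role.allowed_actions:
        raise PermissionDenied(f"{role.name} may not call {action}")
    return ACTIONS[action](**kwargs)

# The LLM agent gets the same narrow role a frontline human agent would.
AGENT = Role("frontline_agent", frozenset({"look_up_order"}))
MANAGER = Role("manager", frozenset({"look_up_order", "issue_refund"}))

print(execute(AGENT, "look_up_order", order_id="A1"))
try:
    execute(AGENT, "issue_refund", order_id="A1", amount=5.0)
except PermissionDenied as err:
    print("escalate to manager:", err)
```

The design choice is that the refund path simply does not exist for the agent's role, the same way a frontline employee's account lacks the refund button, so a persuasive prompt has nothing to override.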

Edit 2: Nevermind. 13th October is the day Microslop has chosen to fuck me: https://support.microsoft.com/en-us/office/system-requirements/end-of-support-for-office-2021
You’ll still get a few years before the software becomes remotely disabled, though. This story about Office 2019 losing functionality follows Office 2019 losing support in 2023. If that’s the rate things go, then maybe Office 2021 will lose functionality either 2 years from now (7 years after release) or 3 years from now (3 years after losing support).

And, as I understand it, Anthropic hasn’t committed as much spending to building out new data centers, and has set up their operations to be GPU agnostic, so they can keep flexibility between NVIDIA GPUs, Google TPUs, and Amazon Trainium, and play the data center pricing game. Anthropic is better positioned to survive an AI winter (and I believe it’s coming soon).

Not with AI in particular, but yes with subscription based software generally.

The economics of it don’t add up, and the growth rate of the curve of improvement over time has already significantly fallen, which, looking at the historical curves for other technologies, is a very strong indication that it’s approaching the limits of how far it will go even though it’s nowhere close to the hype.
Yeah, I’m convinced that they’ve maintained the illusion of continued exponential improvement from 2024-2026 by sneaking in an exponential increase in resources (hardware complexity, power consumption), to prop things up past what should have been a plateau.

AI has an interesting economic trait in that it’s very, very expensive to deploy, and made very fast progress from 2022 to 2024. That caused investors with money to believe that:
But since 2024, we’ve seen that the cutting edge got even more expensive much faster than expected, and much of the improvement in performance now comes from inference rather than training, which represents a high ongoing cost.
Now, if we extrapolate from that trend line, we’ll see that the market will be much smaller for AI services at the cost it takes to provide that service, and the question then becomes whether the industry can make its operations cheaper, fast enough to profitably provide a service people will pay for.
I have my doubts they’ll succeed, and we might just be looking at the industry like supersonic flight: conceptually interesting, technically feasible, but just a commercial dead end because it’s too expensive.

Don’t they put plutonium reactors in space?
The ones that power spacecraft generate less than 5000W of heat at max power (while producing 300W of usable electricity).
In order to power a single server rack of 72 Blackwell GPUs, which takes about 130,000 watts, you’d need about 430 of those RTGs, and you’d need to manage 430 times the cooling load (plus however much additional power the cooling system itself requires).
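
The same arithmetic as a quick sketch, using only the figures quoted above. The 5,000 W heat figure is treated as an upper bound, since the comment says "less than 5000W", and the power draw of the cooling system is left out.

```python
import math

# Figures taken from the comment.
RACK_POWER_W = 130_000    # one rack of 72 Blackwell GPUs
RTG_ELECTRIC_W = 300      # usable electricity per RTG
RTG_HEAT_W = 5_000        # thermal output per RTG, treated as an upper bound

# Whole RTGs needed to supply the rack's electrical load.
rtgs_needed = math.ceil(RACK_POWER_W / RTG_ELECTRIC_W)

# Upper bound on the heat to reject: the electricity ends up as heat in the
# rack too, so the total is bounded by the RTGs' full thermal output.
total_heat_upper_w = rtgs_needed * RTG_HEAT_W

print(rtgs_needed, total_heat_upper_w)
```

Rounding up gives 434 units, in line with "about 430", and up to roughly 2.17 MW of heat to get rid of for a single rack.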

Because it obviously was.
The dashes, the short sentences, the bullet points, the overly familiar tone that seems LinkedIn-ish. All of it sounds like AI.