• 0 Posts
  • 170 Comments
Joined 3 years ago
Cake day: July 5th, 2023





  • The only solution is to make sure they can’t read data you don’t want shared.

    Isn’t that the appropriate guardrail, then? LLM chats and agents and whatever need to be contained with external permissions settings that the LLMs simply do not and can never have the power to override.

    In a normal customer service setting with human agents, there are still plenty of examples of things a human agent simply doesn’t have the power to do. Often, they’ll need to escalate to a manager to do things like process refunds, not just because they weren’t given social permission to do so, but because they weren’t given technical permissions to do so. LLM agents need to be contained in the same way. Any decent use of agents, human or software, requires carefully designed processes and permissions extrinsic to that agent’s own decision-making abilities to make sure that agents don’t do something bad for the company.
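    As a rough illustration of permissions that live outside the agent, here is a minimal sketch. Every name in it (`ToolGateway`, `Role`, the `lookup_order` and `issue_refund` actions) is hypothetical and made up for this example; it is not any real framework's API. The point is only that the allow-list is checked by code the agent cannot talk its way around.

```python
from dataclasses import dataclass, field


class PermissionDenied(Exception):
    """Raised when a caller requests an action its role was never granted."""


# Hypothetical backend actions, stubbed out for illustration only.
BACKEND = {
    "lookup_order": lambda order_id: {"order_id": order_id, "status": "shipped"},
    "issue_refund": lambda order_id, amount: {"order_id": order_id, "refunded": amount},
}


@dataclass(frozen=True)
class Role:
    name: str
    allowed_actions: frozenset


@dataclass
class ToolGateway:
    """Sits between any agent (human or LLM) and the real backend.

    The agent only ever calls the gateway, so nothing in a prompt or a
    conversation can widen the set of actions its role was granted.
    """
    role: Role
    audit_log: list = field(default_factory=list)

    def call(self, action, **params):
        allowed = action in self.role.allowed_actions
        # Record every attempt, including the refused ones.
        self.audit_log.append((self.role.name, action, allowed))
        if not allowed:
            raise PermissionDenied(f"{self.role.name} may not perform {action!r}")
        return BACKEND[action](**params)


# Assumed roles: the frontline agent can look things up, only a manager can refund.
FRONTLINE = Role("frontline_agent", frozenset({"lookup_order"}))
MANAGER = Role("manager", frozenset({"lookup_order", "issue_refund"}))
```

    With this shape, an LLM wired to `ToolGateway(FRONTLINE)` can be argued into *wanting* to issue a refund, but the call still raises `PermissionDenied` and the attempt lands in the audit log, which is the software equivalent of having to escalate to a manager.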






  • AI has an interesting economic trait: it’s very, very expensive to deploy, and it made very fast progress from 2022 to 2024. That led investors with money to believe that:

    • Pushing the frontier was going to cost a lot of money. More than any other purported revolutionary tech.
    • Extrapolating from past improvement, whoever was on the cutting edge might end up with a product with a huge paying market.
    • So whoever wins this race would be rich, and the investment would have been worth it for them.

    But since 2024, we’ve seen that the cutting edge got even more expensive much faster than expected, and much of the improvement in performance now comes from inference rather than training, which represents a high ongoing cost.

    Now, if we extrapolate from that trend line, we’ll see that the market will be much smaller for AI services at the cost it takes to provide that service, and the question then becomes whether the industry can make its operations cheaper, fast enough to profitably provide a service people will pay for.

    I have my doubts they’ll succeed, and we might just be looking at an industry like supersonic flight: conceptually interesting, technically feasible, but a commercial dead end because it’s too expensive.


  • Don’t they put plutonium reactors in space?

    The ones that power spacecraft generate less than 5000W of heat at max power (while producing 300W of usable electricity).

    In order to power a single server rack of 72 Blackwell GPUs, which takes about 130,000 watts, you’d need about 430 of those RTGs, and you’d need to manage 430 times the cooling requirement (plus however much additional power the cooling system itself requires).
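    The arithmetic behind that estimate can be sketched as follows, using only the figures quoted above (300 W electric and at most 5,000 W thermal per RTG, 130,000 W per rack). The function names are just for illustration, and the thermal figure is an upper bound because the comment says "less than 5000W".

```python
import math

RTG_ELECTRIC_W = 300     # usable electricity per RTG (figure from the comment)
RTG_THERMAL_W = 5_000    # upper bound on heat per RTG (figure from the comment)
RACK_POWER_W = 130_000   # one 72-GPU Blackwell rack (figure from the comment)


def rtgs_needed(load_w, electric_per_rtg_w=RTG_ELECTRIC_W):
    """Whole number of RTGs required to supply load_w watts of electricity."""
    return math.ceil(load_w / electric_per_rtg_w)


def total_thermal_w(n_rtgs, thermal_per_rtg_w=RTG_THERMAL_W):
    """Upper bound on total heat produced by n_rtgs units.

    The electricity delivered to the rack also ends up as heat, so this
    total is roughly what the whole system has to radiate away.
    """
    return n_rtgs * thermal_per_rtg_w


n = rtgs_needed(RACK_POWER_W)
print(n, total_thermal_w(n))  # 434 2170000
```

    So a single rack works out to roughly 434 RTGs and up to about 2.2 MW of heat to get rid of, before counting whatever the cooling system itself draws.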





I’ve read some of Ed Zitron’s long posts on why the AI industry is a bubble that will never be profitable (and will bring down a lot of companies and investors), and one of the recurring themes is that the AI companies are trying to capture growing market share in an industry where their marginal profits are still negative, and that any increase in revenue necessarily increases their costs of providing their services.

But some of the comments in various HackerNews threads are dismissive, saying that each new generation of models makes the cost of inference lower, so that with sufficient customer volume, the companies running the models can make enough profit on inference to make up for the staggering up-front capital expenditures it took to build out the data centers, train their models, etc.

It’s all pretty confusing to me. So for those of you who are familiar with the industry, I have several questions:

  1. Is the cost of running a pretrained model going down, for any specific model? Are there hardware and software improvements that make it cheaper to run those models, despite the model itself not changing?
  2. Is the cost of performing a particular task at a particular quality level going down, through releases of newer models of similar performance (i.e., a smaller model of the current generation performing similarly to a bigger model of the previous generation, such that the cost is now cheaper)?
  3. Is the cost of running the largest flagship frontier models going down for any given task? Or does running the cutting edge show-off tasks keep increasing in cost, but where the companies argue that the improvement in performance is worth the cost increase?

I suspect the discussion around this is so muddled online because the answers differ depending on which of the 3 questions is meant by “is running an AI model getting cheaper over time?” But I wanted to hear from people who are knowledgeable about these topics.
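To make the two positions described earlier concrete, here is a small sketch of the break-even logic. The function and its parameters are made up for illustration, and no real pricing or cost data is assumed. If the per-request margin is negative, more volume only deepens the loss; if it is positive, the up-front spend can in principle be recovered with enough volume.

```python
import math


def break_even_requests(capex, price_per_request, marginal_cost_per_request):
    """Number of requests needed to recover capex, or None if it never happens.

    All inputs are hypothetical and must share one currency unit.
    """
    margin = price_per_request - marginal_cost_per_request
    if margin <= 0:
        # Each extra request loses money (or earns nothing), so volume cannot help.
        return None
    return math.ceil(capex / margin)


def profit(requests, capex, price_per_request, marginal_cost_per_request):
    """Total profit after serving `requests` requests."""
    return requests * (price_per_request - marginal_cost_per_request) - capex
```

Which branch applies is exactly what the three questions above are trying to pin down: the answer depends on whether `marginal_cost_per_request` is measured for a fixed model, a fixed task quality, or the current flagship.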




  • Testing a bunch of Linux distros on old Intel MacBooks has shown me that Apple is really good with resource management on their vertically integrated hardware, even with greedy daemons like identityserverd or whatever it is, trolling through your drive cataloguing faces in your photos all the time, and the relentless indexing system, and telemetry.

    It’s really amazing to me how little power macOS uses in normal use, compared to running Linux on the same machine. The Asahi Linux project has also documented a ton of interesting bits of hardware that macOS makes use of, pretty seamlessly, and that the Asahi developers have to figure out for themselves.