Joined 3 years ago
Cake day: July 5th, 2023



  • AI has an unusual economic profile: it’s very, very expensive to deploy, and it made very fast progress from 2022 to 2024. That led investors to believe that:

    • Pushing the frontier was going to cost a lot of money. More than any other purported revolutionary tech.
    • Extrapolating from past improvement, whoever was on the cutting edge might end up with a product with a huge paying market.
    • So whoever wins this race would be rich, and the investment would have been worth it for them.

    But since 2024, we’ve seen that the cutting edge got even more expensive much faster than expected, and much of the improvement in performance now comes from inference rather than training, and inference is a high ongoing cost.

    Now, if we extrapolate from that trend line, the market for AI services, priced at what it costs to provide them, will be much smaller. The question then becomes whether the industry can make its operations cheaper fast enough to profitably provide a service people will pay for.

    I have my doubts they’ll succeed, and we might just be looking at an industry like supersonic flight: conceptually interesting, technically feasible, but a commercial dead end because it’s too expensive.


  • Don’t they put plutonium reactors in space?

    The ones that power spacecraft generate less than 5000W of heat at max power (while producing 300W of usable electricity).

    To power a single server rack of 72 Blackwell GPUs, which takes about 130,000 watts, you’d need about 430 of those RTGs, and you’d need to manage about 430 times the cooling requirement (plus however much additional power the cooling system itself requires).
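
    The arithmetic above can be checked with a quick sketch. It only uses the figures quoted in this comment (300 W electric and up to 5000 W of heat per RTG, 130,000 W per rack), so treat the results as rough estimates rather than engineering numbers. The variable names are made up for illustration.

```python
import math

# Figures quoted in the comment above; rough estimates, not a spec sheet.
RTG_ELECTRIC_W = 300     # usable electricity per RTG
RTG_THERMAL_W = 5000     # heat per RTG (quoted as an upper bound)
RACK_POWER_W = 130_000   # one rack of 72 Blackwell GPUs

# Number of RTGs needed to supply the rack's electrical load.
rtgs_needed = math.ceil(RACK_POWER_W / RTG_ELECTRIC_W)

# All of the thermal output eventually has to be rejected as heat,
# including the part that was briefly electricity inside the rack.
total_heat_w = rtgs_needed * RTG_THERMAL_W

# Fraction of the heat that becomes usable electricity.
efficiency = RTG_ELECTRIC_W / RTG_THERMAL_W

print(f"RTGs needed: {rtgs_needed}")
print(f"Heat to reject: {total_heat_w / 1e6:.2f} MW")
print(f"Conversion efficiency: {efficiency:.0%}")
```

    Rounding up gives a few more RTGs than the "about 430" figure, and a heat load of roughly 2 MW for a single rack, before counting anything the cooling system itself draws.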





I’ve read some of Ed Zitron’s long posts on why the AI industry is a bubble that will never be profitable (and will bring down a lot of companies and investors), and one of the recurring themes is that the AI companies are trying to capture growing market share in an industry where their marginal profits are still negative, and that any increase in revenue necessarily increases their costs of providing their services.

But some of the comments in various Hacker News threads are dismissive, saying that each new generation of models makes the cost of inference lower, so that with sufficient customer volume, the companies running the models can make enough profit on inference to make up for the staggering up-front capital expenditures it took to build out the data centers, train their models, etc.

It’s all pretty confusing to me. So for those of you who are familiar with the industry, I have several questions:

  1. Is the cost of running a pretrained model going down, for any specific model? Are there hardware and software improvements that make it cheaper to run those models, despite the model itself not changing?
  2. Is the cost of performing a particular task at a particular quality level going down, through releases of newer models of similar performance (i.e., a smaller model of the current generation performing similarly to a bigger model of the previous generation, such that the cost is now cheaper)?
  3. Is the cost of running the largest flagship frontier models going down for any given task? Or does running the cutting edge show-off tasks keep increasing in cost, but where the companies argue that the improvement in performance is worth the cost increase?

I suspect the discussion around this is so muddled online because the answers differ depending on which of the 3 questions is meant by “is running an AI model getting cheaper over time?” But I wanted to hear from people who are knowledgeable about these topics.
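
To make the distinction concrete, here is a minimal sketch of how the per-token price and the per-task cost can move in opposite directions. Every number in it is hypothetical and chosen only for illustration; none of them come from any real provider's pricing, and `cost_per_task` is a made-up helper.

```python
def cost_per_task(price_per_million_tokens: float, tokens_per_task: int) -> float:
    """Cost of one task, given a per-token price and how many tokens the task uses."""
    return price_per_million_tokens * tokens_per_task / 1_000_000

# HYPOTHETICAL numbers, for illustration only.
# "Old" model: pricier per token, but answers in relatively few tokens.
old_cost = cost_per_task(price_per_million_tokens=10.0, tokens_per_task=2_000)

# "New" model: half the per-token price, but it uses ten times as many
# tokens on the same task (for example, long reasoning traces).
new_cost = cost_per_task(price_per_million_tokens=5.0, tokens_per_task=20_000)

print(f"old: ${old_cost:.2f} per task, new: ${new_cost:.2f} per task")
print(f"per-token price ratio: {5.0 / 10.0:.1f}x, per-task cost ratio: {new_cost / old_cost:.1f}x")
```

Under these made-up inputs, "inference got cheaper" is true per token and false per task, which is the kind of mismatch that could explain people talking past each other.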




  • Testing a bunch of Linux distros on old Intel MacBooks has shown me that Apple is really good with resource management on their vertically integrated hardware, even with greedy daemons like identityserverd or whatever it is, trolling through your drive cataloguing faces in your photos all the time, and the relentless indexing system, and telemetry.

    It’s really amazing to me how little power macOS uses in normal use, compared to running Linux on the same machine. The Asahi Linux project also has documented a ton of interesting bits of hardware that macOS makes use of, pretty seamlessly, that they’ve gotta figure out.




  • AI avatar man wants you to be afraid: “sleeper agents”! “backdoors”! “poisoned documents”! Terrifying!

    It is terrifying. People in positions of power have placed entirely too much trust in machines that are this easily fooled. I’d argue that we shouldn’t trust these machines as much as we do, but I don’t think the rest of the world is listening enough to these warnings.

    I also worry about how broken search result rankings have gotten. For someone like me who doesn’t use these AI products, it concerns me that actual search engines (which I do use) continue to get worse.

    Sure, there are lessons here for those who build and maintain LLMs, but everyone else should still be terrified at how the world is moving towards, rather than away from, this nonsense.


  • It’s really important for people to understand that E2EE only protects messages in transit between the ends; it cannot protect them at the ends themselves. The best encryption in the world can’t help you if the person you’re talking to is an undercover cop, because that “end” can do whatever they want with the plaintext, including recording, storing, or forwarding the plaintext of any messages they encrypt and send, or any messages they receive and decrypt.

    That’s not a flaw of the E2EE protocol itself, but is a limit to the scope of protection that E2EE provides.


  • Here’s the original reporting, instead of another website’s summary of Bloomberg’s actual report:

    https://www.bloomberg.com/news/articles/2026-04-28/us-ends-investigation-into-claims-whatsapp-chats-aren-t-private

    https://archive.is/sGE3e

    So it sounds like the agent was investigating allegations, from content moderation contractors, that Meta could access the contents of WhatsApp messages, and came to the conclusion that yes, Meta could.

    There are a few possibilities here.

    1. Meta does have full plaintext access to all WhatsApp messages, but guards that access very closely. Although the clients seem to generate E2EE keys for each session, somehow they’re leaking those keys to Meta’s servers somewhere, and the closed source code hides that well enough that no whistleblower or security researcher has been able to detect it definitively.
    2. Meta has a secret wiretap functionality where they can compromise the E2EE keys somehow, but uses it only for narrow cases. This helps keep the functionality secret, because security researchers and other reviewers may never see the functionality in action.
    3. Meta allows users to report objectionable content in the threads they’re already part of. The reporting function forwards either the E2EE key itself or all the plaintext data, which gives content moderators access to the underlying message contents. The contractor whistleblowers and the federal agent investigating these allegations simply got it wrong, and misunderstood the technical process of how the plaintext messages end up in the content moderator’s possession.

    Meta claims that it’s #3. They acknowledge they have plaintext access to messages when a party to the thread presses the report button.

    This unnamed federal agent believes it’s #1, after 10 months of investigation, and sent out an email to other investigators that they should look into that possibility.

    I’m skeptical of #1, simply because I don’t believe that conspiracies to keep that kind of stuff secret can be maintained. It’s not just that there would be technically skilled whistleblowers who have actual access to the code (not the non-technical content moderator contractors who review the content); a weakness in such an important and widely used protocol would also attract all sorts of hackers, state sponsored or otherwise.

    But option #2 might explain everything we’ve seen so far. Full wiretap capability that is rarely used and very tightly controlled.


  • Anybody who believed that quantum computing posed a risk to symmetric encryption was fundamentally misunderstanding how encryption works and what quantum computing might be good at one day.

    Asymmetric cryptography is primarily used for securely exchanging symmetric keys: the two parties use a public/private key pair to securely agree on which symmetric key to use for their session, and then both sides switch to the symmetric key for actual communication of a real payload.

    A public/private key pair is two keys that have some interesting mathematical relationship, such that it is easy to use the public key either to confirm that someone possesses the right private key or to encrypt something that only the correct private key can decrypt. That mathematical relationship, which in RSA relates to the product of two very large prime numbers, is at the core of modern asymmetric cryptography. (Elliptic-curve schemes rely on a different hard problem, the discrete logarithm, which a quantum computer is also expected to break.)

    Quantum computing may make number factorization much, much easier. So once a product of two large primes becomes possible to factor, the public/private key pairs might not be as secure anymore.
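
    As a rough illustration of why factoring matters, here is a toy RSA sketch. The primes are tiny textbook values so that plain trial division can stand in for whatever a quantum computer might do to real key sizes; the function names are made up for this example, and nothing here is safe for actual use. It needs Python 3.8+ for the modular inverse via `pow(e, -1, phi)`.

```python
# Toy RSA with tiny textbook primes. Illustrative only: real keys use
# primes hundreds of digits long, and real RSA also needs padding.
p, q = 61, 53
n = p * q                      # public modulus
e = 17                         # public exponent
d = pow(e, -1, (p - 1) * (q - 1))  # private exponent, derived from p and q

def factor(modulus):
    """Trial division, standing in for a fast factoring method."""
    f = 2
    while f * f <= modulus:
        if modulus % f == 0:
            return f, modulus // f
        f += 1
    raise ValueError("modulus is prime")

def recover_private_exponent(modulus, public_exponent):
    """Rebuild the private exponent from public values alone, by factoring."""
    a, b = factor(modulus)
    return pow(public_exponent, -1, (a - 1) * (b - 1))

message = 65
ciphertext = pow(message, e, n)          # anyone can encrypt with (n, e)

recovered_d = recover_private_exponent(n, e)
decrypted = pow(ciphertext, recovered_d, n)
print(recovered_d == d, decrypted == message)
```

    The attacker only ever sees `n`, `e`, and the ciphertext. The whole scheme rests on `factor` being infeasible at real key sizes.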

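    But none of this has much to do with symmetric encryption, or hash functions. The best known quantum attack on those, Grover’s algorithm, offers only a quadratic speedup, which longer keys offset, so quantum doesn’t really move the needle on that particular math.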

    The real risk, though, is an adversary who eavesdrops on both the encrypted key exchange (which uses asymmetric cryptography) and the message itself (which uses symmetric cryptography). Once the asymmetric protocol is compromised, they can recover the secret symmetric key from the intercepted key exchange, and then use it to decrypt the symmetric portion of the communication too.
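
    That record-now, decrypt-later scenario can be sketched end to end. Everything below is a toy under stated assumptions: the RSA primes are tiny so brute-force factoring stands in for a future quantum attack, the "symmetric cipher" is just XOR with a SHA-256 counter keystream, and the names (`keystream_xor`, `decrypt_later`, `recorded`) are invented for this example.

```python
import hashlib
import secrets

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a SHA-256 counter keystream (not for real use)."""
    out = bytearray()
    for i in range(0, len(data), 32):
        block = hashlib.sha256(key + i.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[i:i + 32], block))
    return bytes(out)

# Receiver's toy RSA key pair (tiny primes, illustrative only).
p, q, e = 61, 53, 17
n = p * q

# Sender: pick a session key, wrap it with the public key, encrypt the payload.
session_key = secrets.randbelow(n - 2) + 2            # integer in [2, n - 1]
wrapped_key = pow(session_key, e, n)                   # asymmetric step
payload = keystream_xor(session_key.to_bytes(2, "big"), b"meet at noon")  # symmetric step

# Eavesdropper: stores everything that crossed the wire, all of it public.
recorded = {"n": n, "e": e, "wrapped_key": wrapped_key, "payload": payload}

def decrypt_later(t):
    """Years later: factor n, rebuild the private key, unwrap the session key."""
    f = next(c for c in range(2, t["n"]) if t["n"] % c == 0)
    priv = pow(t["e"], -1, (f - 1) * (t["n"] // f - 1))
    key = pow(t["wrapped_key"], priv, t["n"])
    return keystream_xor(key.to_bytes(2, "big"), t["payload"])

print(decrypt_later(recorded))
```

    Note that the symmetric cipher is never attacked directly here. The session key falls out of the broken key exchange, which is why the asymmetric half is the part that needs replacing.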