An AI safety researcher has quit US firm Anthropic with a cryptic warning that the “world is in peril”.

In his resignation letter shared on X, Mrinank Sharma told the firm he was leaving amid concerns about AI, bioweapons and the state of the wider world.

He said he would instead look to pursue writing and studying poetry, and move back to the UK to “become invisible”.

It comes in the same week that an OpenAI researcher said she had resigned, sharing concerns about the ChatGPT maker’s decision to deploy adverts in its chatbot.

Anthropic, best known for its Claude chatbot, had released a series of commercials aimed at OpenAI, criticising the company’s move to include adverts for some users.

The company, which was formed in 2021 by a breakaway team of early OpenAI employees, has positioned itself as having a more safety-orientated approach to AI research compared with its rivals.

Sharma led a team there which researched AI safeguards.

He said in his resignation letter his contributions included investigating why generative AI systems suck up to users, combatting AI-assisted bioterrorism risks and researching “how AI assistants could make us less human”.

But he said despite enjoying his time at the company, it was clear “the time has come to move on”.

“The world is in peril. And not just from AI, or bioweapons, but from a whole series of interconnected crises unfolding in this very moment,” Sharma wrote.

He said he had “repeatedly seen how hard it is to truly let our values govern our actions” - including at Anthropic which he said “constantly face pressures to set aside what matters most”.

Sharma said he would instead look to pursue a poetry degree and writing.

He added in a reply: “I’ll be moving back to the UK and letting myself become invisible for a period of time.”

Those departing AI firms which have loomed large in the latest generative AI boom - and sought to retain talent with huge salaries or compensation offers - often do so with plenty of shares and benefits intact.

Eroding principles

Anthropic calls itself a “public benefit corporation dedicated to securing [AI’s] benefits and mitigating its risks”.

In particular, it has focused on preventing the risks it believes are posed by more advanced frontier systems, such as their becoming misaligned with human values, being misused in areas such as conflict, or growing too powerful.

It has released reports on the safety of its own products, including when it said its technology had been “weaponised” by hackers to carry out sophisticated cyber attacks.

But it has also come under scrutiny over its practices. In 2025, it agreed to pay $1.5bn (£1.1bn) to settle a class action lawsuit filed by authors who said the company stole their work to train its AI models.

Like OpenAI, the firm also seeks to seize on the technology’s benefits, including through its own AI products such as its ChatGPT rival Claude.

It recently released a commercial that criticised OpenAI’s move to start running ads in ChatGPT.

OpenAI boss Sam Altman had previously said he hated ads and would use them as a “last resort”.

Last week, he hit back at the advert’s description of this as a “betrayal” - but was mocked for his lengthy post criticising Anthropic.

Writing in the New York Times on Wednesday, former OpenAI researcher Zoe Hitzig said she had “deep reservations about OpenAI’s strategy”.

“People tell chatbots about their medical fears, their relationship problems, their beliefs about God and the afterlife,” she wrote.

“Advertising built on that archive creates a potential for manipulating users in ways we don’t have the tools to understand, let alone prevent.”

Hitzig said a potential “erosion of OpenAI’s own principles to maximise engagement” might already be underway at the firm.

She said she feared this may accelerate if the company’s approach to advertising does not reflect its values to benefit humanity.

BBC News has approached OpenAI for a response.

  • Techlos@lemmy.dbzer0.com · 1 upvote · 1 minute ago

    I’m going to throw my own thoughts in on this. I got into machine learning around 2015, back when relu activations were still bleeding edge innovations, and got out around 2020 for honestly pretty similar reasons.

    Emotions can be, and have been, used as optimisation targets. Engagement is an ever-present target. And in the framework of capitalism, one optimisation target rules above all others: alignment with continued use. It’s part of what leads to the bootlicking LLM phenomenon. For the average human, it drives future engagement.

    The real danger isn’t the newer language models, or anything really to do with neural net architecture; rather, it’s the fact that we’ve found that a simple function minimisation strategy can be used to approximate otherwise intractable functions. The deeper you research, the more clear it becomes that any arbitrary objective can be optimised, given a suitable function approximator and enough data to fit the approximator accurately.

    Human minds are also universal function approximators.

  • BenderRodriguez@lemmy.world · 48 upvotes · 2 hours ago

    A researcher left his high seat,
    With a warning of global defeat.
    To the UK he’ll flee,
    To write poetry,
    And vanish in shadowy retreat.

  • Hackworth@piefed.ca · 21 upvotes · 2 downvotes · edited · 2 hours ago

    FWIW, Anthropic did just fund a pro-regulation super PAC to oppose OpenAI’s/Palantir’s pro-Trump/anti-regulation PAC, and:

    “The Pentagon is at odds with artificial-intelligence developer Anthropic over safeguards that would prevent the government from deploying its technology to target weapons autonomously and conduct U.S. domestic surveillance.” (Reuters)

    But I kinda doubt they’ll be able to play the good guy for long.

    • XLE@piefed.social · 6 upvotes · 2 downvotes · 2 hours ago

      The regulations this PAC promotes are almost laughable. Do they mention CSAM generation? Deepfakes? Pollution? Water table destruction? Suicide encouragement? Nope.

      Those harms are apparently acceptable.

      Instead, they say we should focus on “the nearest-term high risks: AI-enabled biological weapons and cyberattacks.” Sci-fi fiction.

      • Hackworth@piefed.ca · 4 upvotes · edited · 42 minutes ago

        They’re advocating for transparency and for states to be able to have their own AI laws. I see that as positive. And as part of that transparency, Anthropic publishes its system prompts, which go through with every message. They devote a significant portion to mental health, suicide prevention, not enabling mania, etc. So I wouldn’t say they see it as “acceptable.”

          • Hackworth@piefed.ca · 2 upvotes · 1 downvote · 29 minutes ago

            So what I meant by “doubt they’ll be able to play the good guy for long” is exactly that no corpo is your friend. But I also believe perfect is the enemy of good, or at least better. I want to encourage companies to be better, knowing full well that they will not be perfect. Since Anthropic doesn’t make image/video/audio generators, they may just not see CSAM as a directly related concern for the company. A PAC doesn’t have to address every harm to be a source of good.

            As for self-harm, that’s an alignment concern, the main thing they do research on. And based on what they’ve published, they know that perfect alignment is not in our foreseeable future. They’ve made a lot of recent improvements that make it demonstrably harder to push a bot to dark traits. But they know damn well they can’t prevent it without some structural breakthroughs. And who knows if those will ever come?

            I read that 404 media piece when it got posted here, and this is also probably that guy’s fault. And frankly, Dario’s energy creeps me out. I’m not putting Anthropic on a pedestal here, they’re just… the least bad… for now?

  • X@piefed.world · 3 upvotes · 1 hour ago

    He said he […] move back to the UK to “become invisible”.

    Literally won’t be happening, but okay.

    • excursion22@piefed.ca · 1 upvote · 27 minutes ago

      Yeah, not really the best place to go to be invisible. However, who knows if that’s actually where he’ll go.

  • hansolo@lemmy.today · 43 upvotes · 3 hours ago

    Translation: “All y’all gonna get sued so hard one day. I’m out, I got paid $74 million last year.”

  • HubertManne@piefed.social · 3 upvotes · 1 hour ago

    All the tasty humans get so paranoid about AI and how it might be trying to hide among them and blend in so it can prey on them one by one. It’s like, lower your temperature, my male siblings!

  • ArgentRaven@lemmy.world · 8 upvotes · 2 hours ago

    This is literally the plot of Player Piano by Kurt Vonnegut. Interesting that he was able to predict it that far ahead.

  • XLE@piefed.social · 6 upvotes · 3 downvotes · 2 hours ago

    “AI safety” continues to be a grift to promote AI products.

    Mrinank Sharma of Anthropic should be remembered as a liar for lines like

    The world is in peril. And not just from AI or bioweapons, but from a whole series of interconnected crises unfolding in this very moment

    Despite his letter insisting he’s leaving Anthropic to be more honest, he’s just regurgitating the same propaganda as before, making promises to mislead investors, and advocating for regulations that don’t address any real harms, but will help them monopolize a market.