In the days after the US Department of Justice (DOJ) published 3.5 million pages of documents related to the late sex offender Jeffrey Epstein, multiple users on X have asked Grok to “unblur” images or remove the black boxes that cover the faces of children and women and were meant to protect their privacy.

  • Paranoidfactoid@lemmy.world
    37 upvotes · 10 hours ago

    How do these AI models generate nude imagery of children without having been trained with data containing illegal images of nude children?

    • AnarchistArtificer@slrpnk.net
      29 upvotes · 9 hours ago

      The datasets they are trained on do in fact include CSAM. These datasets are so huge that it easily slips through the cracks. It’s usually removed whenever it’s found, but I don’t know how this actually affects the AI models that have already been trained on that data. To my knowledge, it’s not possible to selectively “untrain” models, and they would need to be retrained from scratch. Plus, I occasionally see news reports that new CSAM keeps being found in the training data.

      It’s one of the many, many problems with generative AI.

    • calcopiritus@lemmy.world
      3 upvotes, 2 downvotes · 4 hours ago

      Tbf it’s not needed. If it can draw children and it can draw nude adults, it can draw nude children.

      Just like it doesn’t need to have trained on purple geese to draw one. It just needs to know how to draw purple things and how to draw geese.

      • WraithGear@lemmy.world
        6 upvotes · edited · 4 hours ago

        That’s not true; a child and an adult are not the same, and AI cannot do such things without the training data. It’s the full wine glass problem. And the only reason THAT example was fixed, after it was used to show the methodology problem with AI, is that they literally trained it for that specific thing to cover it up.

        • Jarix@lemmy.world
          3 upvotes · 2 hours ago

          I’m not saying it wasn’t trained on CSAM or defending any AI.

          But your point isn’t correct.

          The prompts you use and how you request changes can get the same results. Clever prompts already circumvent many hard-wired protections. It’s a game of whack-a-mole, and every new iteration of an AI will require different methods to bypass those protections.

          If you ask it in the right way, it will do whatever a prompt tells it to do.

          You can’t tell it to make a nude image of a child, I assume, but you can tell it make the subject in the image of the last prompt 60% smaller and adjust it as necessary to make it believable. That probably shouldn’t work, but I don’t put anything past these assholes.

          It doesn’t take actual training images or data if you can just tell it how to get the results you want by using different language that it hasn’t been told not to accept.

          The AI doesn’t know what it is doing, it’s simply running points through its system and outputting the results.

          • MathiasTCK@lemmy.world
            1 upvote · 20 minutes ago

            It still seems pretty random. So when they say they fixed it so it won’t do something, all they likely did was reduce the probability, which is why we still get screenshots showing what it sometimes lets through.

      • slampisko@lemmy.world
        2 upvotes · 4 hours ago

        That’s not exactly true. I don’t know about today, but I remember reading an article about a year ago about an image generation model that, despite many attempts, could not generate a wine glass full to the brim, because all the wine glasses the model was trained on were half-filled.

        • calcopiritus@lemmy.world
          1 upvote · 4 hours ago

          Did it have any full glasses of water? According to my theory, it has to have data for both “full” and “wine”.

          • vala@lemmy.dbzer0.com
            1 upvote · 2 hours ago

            Your theory is more or less incorrect. It can’t interpolate as broadly as you think it can.

    • Senal@programming.dev
      3 upvotes · 9 hours ago

      Easy answer is, they don’t.

      Though that’s just the one admitting to it.

      A slightly more nuanced answer is, it probably depends. There’s likely to be some inference made between age ranges, but my guess is that it’d be sub-par, given that it sometimes struggles with reproducing images it has a tonne of actual data for.