• Allero@lemmy.today · 17 days ago

    Here’s my guess, aside from highlighted token issues:

    We all know LLMs train on human-generated data. And when we ask something like “how many R’s” or “how many L’s” are in a given word, we usually don’t mean a count of every occurrence; we normally mean “how many of that letter appear in a row, so I can spell it right”.

    Yes, the word “strawberry” has 3 R’s. But what most people are interested in is whether it is “strawberry” or “strawbery”, and their “how many R’s” refers to exactly that, not to the whole word.

    • jj4211@lemmy.world · 17 days ago

      It doesn’t even see the word ‘strawberry’; the input has been tokenized, so the model never sees the ‘text’ that was typed in.

      It’s more like it sees a question like: how many ‘r’s in 草莓 (the Chinese word for strawberry)?

      And it spits out an answer based not on analysis of the input, but on a model of what people might have said.
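
      As a rough illustration of what tokenization does here, a minimal sketch using OpenAI’s tiktoken library (assuming it is installed; the exact split is illustrative and depends on the model’s vocabulary):

      ```python
      # The model never receives letters, only integer token IDs from a BPE vocabulary.
      import tiktoken

      enc = tiktoken.get_encoding("cl100k_base")  # tokenizer used by several OpenAI models

      text = "how many r's in strawberry?"
      token_ids = enc.encode(text)

      print(token_ids)                             # opaque integers, not characters
      print([enc.decode([t]) for t in token_ids])  # roughly ['how', ' many', ' r', "'s", ' in', ' straw', 'berry', '?']
      ```

      So “strawberry” typically arrives as a couple of chunks like ‘ straw’ + ‘berry’, and the letter ‘r’ is never something the model can count directly.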