

“LLM” is being used colloquially here. It just describes how the algorithm is arranged: tokenize the input, then generate output by stacking the most likely subsequent tokens, and so on.
It still distinguishes LLMs from other kinds of neural networks and from more basic forms of machine “learning” (god, what an anthropomorphized term from the start…).
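For anyone who wants the "tokenize, then keep stacking the most likely next token" loop spelled out, here is a rough toy sketch. To be clear about the assumptions: this is a bigram word counter, not an actual LLM. Real models use subword tokenizers and a neural network to score the next token, and they often sample instead of always taking the top choice. The names `tokenize`, `train_bigrams`, and `generate` are made up for illustration.

```python
from collections import Counter, defaultdict


def tokenize(text):
    # Toy tokenizer (assumption): lowercase and split on whitespace.
    return text.lower().split()


def train_bigrams(corpus):
    # Count how often each token is followed by each other token.
    counts = defaultdict(Counter)
    tokens = tokenize(corpus)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def generate(counts, prompt, max_new_tokens=5):
    # Greedy decoding: repeatedly append the most frequent follower
    # of the last token, stopping early if nothing ever followed it.
    tokens = tokenize(prompt)
    for _ in range(max_new_tokens):
        followers = counts.get(tokens[-1])
        if not followers:
            break
        tokens.append(followers.most_common(1)[0][0])
    return " ".join(tokens)


corpus = "the cat sat on the mat and the cat sat on the rug"
model = train_bigrams(corpus)
print(generate(model, "the", max_new_tokens=4))  # prints "the cat sat on the"
```

The only thing this shares with a real LLM is the shape of the loop: look at what you have so far, pick a likely next token, append it, repeat.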
Yes. Yes, implying plurality for a singular thing is, by definition, exaggerating.