Discussion about this post

Cathie Campbell

Such an interesting invitation to see “tokens all the way down” and the definition of token - “the thing that stands for another thing”. To imagine AI numerizing (my word) its words and calculating. But does this tokenization philosophically calculate beyond numerical efficiency? Sensory information has tone, lilt, flatness, urgency, etc. As machines make minds, will the delivery allow words to offer “sense” in the “felt” hearing?

Sam Walker

I always appreciated the wall of bas reliefs in the student union at UIUC. But I think you have made a category error here.

Tokens are... unimportant. Ultimately. They are the substrate. The ink and paper of the work. What _matters_ is _what is written with them_.

The tokens themselves have very clear, distinct, defined meanings: the specific textual sequence they represent. That is ALL they represent. 3321-84-7592 means - exclusively - "put characters together to spell ling-u-istics". The idea of "the study of language" is nowhere to be found in tokens, any more than you could sift the pigments of the Mona Lisa into color piles and find a trace of "beauty". Grind a man into atoms and you will never find a speck of "life".
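
To make the point concrete, here is a minimal sketch of what decoding amounts to. The three IDs are the ones used in the comment; the vocabulary itself is a made-up toy, not any real tokenizer's table.

```python
# Hypothetical toy vocabulary: the IDs come from the comment,
# the ID-to-text mapping is invented purely for illustration.
TOY_VOCAB = {3321: "ling", 84: "u", 7592: "istics"}


def decode(token_ids):
    """Concatenate the text piece that each token ID stands for."""
    return "".join(TOY_VOCAB[token_id] for token_id in token_ids)


print(decode([3321, 84, 7592]))  # linguistics
```

Nothing in the table mentions "the study of language"; the lookup only returns characters.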

Tokens encode text (or whatever). Text encodes speech. Speech encodes an approximation of thought. Thought encodes an approximation of meaning.

Meaning, thought, speech, text, tokens -> meaning.

It's taking a comic strip in a foreign language, pressing it into a wad of silly putty, and pressing that into another piece of paper. Maybe you could never read the original directly, but you're pretty sure Ziggy is still swearing in Urdu or whatever.
