See the discussion of this post on Hacker News.
ChatGPT has kicked off a frenzy. It seems to be all anyone in the tech world is talking about. Startups are popping up left and right. Big companies are rapidly releasing ChatGPT-like features integrated into their products.
People are anticipating that large language models are going to revolutionize the world.
And maybe they will.
But a chat bot won't.
Expecting users to primarily interact with software in natural language is lazy.
It puts all the burden on the user to articulate good questions: what to ask, when to ask it, how to ask it, how to make sense of the response, and then how to repeat all of that many times over.
But a user may not know what they don't know.
A good user interface lets me iteratively and incrementally explore the problem and solution space in a variety of ways.
A great user interface guides me and offers nudges.
Couldn't a natural language interface help with that?
But not as the only option. Probably not even the main interface.
The need to support multiple modalities isn't new—it just seems we are so awestruck by LLMs that new software features are launching that regress to a single modality.
Just slap a textbox on it!
The potential of LLMs goes far beyond a natural language interface.
For example, an application could feed the relevant context to the model behind the scenes and use that to preemptively suggest what I should do next. The toolbar could adapt to my specific task. Dialog boxes wouldn't have to be so static. I could point to a region of the screen and ask for an explanation. It could identify a misunderstanding before I know about it (see my prior work on inquisitive interfaces). The system could show me examples based on what I'm doing. Tutorials could take on a personality that better suits me.
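To make the first of those ideas concrete, here is a minimal sketch of context-aware suggestions. The application assembles a prompt from its own state behind the scenes, so the user never types anything. The `complete()` function is a hypothetical stand-in for whatever LLM API the application would actually call:

```python
def build_prompt(app_state: dict) -> str:
    """Serialize relevant application context into a prompt the user never sees."""
    lines = [
        "You are assisting inside a spreadsheet application.",
        f"Open file: {app_state['file']}",
        f"Selected range: {app_state['selection']}",
        f"Recent actions: {', '.join(app_state['recent_actions'])}",
        "Suggest the single most likely next action.",
    ]
    return "\n".join(lines)

def complete(prompt: str) -> str:
    """Stub standing in for a real LLM call; a real app would query a model here."""
    if "sorted column" in prompt:
        return "Apply the same sort to the adjacent column?"
    return "No suggestion."

def suggest_next_action(app_state: dict) -> str:
    # The model sees the context automatically; the user just sees a nudge.
    return complete(build_prompt(app_state))

state = {
    "file": "budget.xlsx",
    "selection": "B2:B40",
    "recent_actions": ["sorted column A", "bolded header row"],
}
print(suggest_next_action(state))
```

The point of the sketch isn't the stub logic; it's that the prompt is constructed by the software from what I'm already doing, not typed by me into a blank textbox.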
The least it could do is intelligently give me a starting point for typing in a prompt. The tyranny of the blank textbox is real.
Because, as my colleague recently said to me:
"people are bad at words"