OpenAI launched ChatGPT today. It's a conversational interface on top of a GPT-3.5-class model with RLHF fine-tuning. I've been using GPT-3 in production since mid-2020. Here's my honest read.
What's actually new
The model capability is not new. GPT-3.5 is an incremental improvement over GPT-3, meaningfully better but not a step change. Anyone who has been using the API has seen this capability level.
What's new is the interface. A clean, direct conversational experience with no API keys, no prompt engineering, no developer account. You go to the website, type something, get a response. The barrier to experiencing the capability went from "technical enough to get API access" to zero.
This is the change that matters. Not the model. The interface.
Why this one is different
I've been building with AI since 2015. I've watched multiple hype cycles. GPT-2 in 2019, GPT-3 in 2020, DALL-E last year. Each one generated excitement and mostly reached developers and researchers. The general public saw articles, not the thing itself.
ChatGPT is the thing itself, accessible to anyone. The demos don't capture it because the demos have existed before. The direct experience does something different. You have a conversation with it and it's coherent and useful in a way that's hard to dismiss as a party trick.
The RLHF component is the specific advance. Reinforcement learning from human feedback trains the model to be helpful, to follow instructions, to decline certain requests, to clarify ambiguity. The raw GPT-3 model is powerful but difficult to use reliably. The RLHF-tuned version is reliable enough for non-technical users to get value from it without prompt engineering expertise. That's a meaningful product advance regardless of what's happening at the model level.
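For the curious: the reward-modeling step inside RLHF is usually framed as a pairwise preference loss (the Bradley-Terry formulation): given two candidate responses, the reward model is trained so the human-preferred one scores higher. A minimal sketch of that loss, my own illustration rather than anything from OpenAI's actual training code:

```python
import math

def preference_loss(r_chosen: float, r_rejected: float) -> float:
    """Pairwise preference loss on reward-model scores.

    Implements -log(sigmoid(r_chosen - r_rejected)): the loss shrinks
    as the reward model scores the human-preferred response higher
    than the rejected one, and grows when it gets the ranking wrong.
    """
    margin = r_chosen - r_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# No preference learned yet: equal scores give a loss of ln(2)
print(preference_loss(0.0, 0.0))
# Correct ranking with a clear margin gives a small loss
print(preference_loss(2.0, 0.0))
# Inverted ranking is penalized heavily
print(preference_loss(0.0, 2.0))
```

Averaged over many human comparisons, minimizing this loss turns scattered "A is better than B" judgments into a scalar reward signal the policy can then be optimized against.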
What I'm watching
How non-technical people use it. The use cases that emerge from the general public using this will be different from what developers have been building. People will find applications that weren't obvious from inside the technical community.
How OpenAI manages the production load. The servers are already straining. Scaling an inference system against this demand curve is a hard engineering problem.
Whether the conversational format reveals limitations that the API format obscured. Longer conversations may expose consistency and memory problems that shorter API interactions didn't surface.
What this means for what we're building
Not much immediately. Countercheck is a computer vision system for a specific physical problem. ChatGPT doesn't change the anti-counterfeiting use case.
Medium term: the expectation of AI fluency is going to increase across our enterprise clients. The appetite for AI-powered features in adjacent workflows (not the core authentication, but reporting, anomaly explanation, configuration assistance) will grow. We should be thinking about where language model capabilities fit into the product before clients start asking.
The honest reaction
I've been close to this technology for years and I still found the experience of using ChatGPT today surprisingly compelling. That's not a common reaction for me at this point. I've been underwhelmed by AI demos for long enough that being genuinely impressed is notable.
The combination of capability and accessibility is the thing. We've had capable models that weren't accessible. We've had accessible interfaces on models that weren't capable enough. ChatGPT is the first time both conditions are true at the same time for a general-purpose language model.
I don't know what the full consequence of that is. I'm paying close attention.
With gusto, Fatih.