A practitioner's year in review. Not a predictions post. What actually changed, what's oversold, and what I'm watching.
What actually changed
The interface layer. ChatGPT was November 2022. The consequence landed in 2023. Every enterprise client we work with has had a board-level conversation about AI this year. The language model capability was there in 2020 with GPT-3. The general awareness of it was 2023. That gap closing matters more than any model release.
The context window. Claude 2 at 100K tokens, GPT-4 at 32K, and the general trend toward longer contexts changed what's buildable. The tasks that required chunking, summarization pipelines, and lossy compression to fit model context can now be done in a single pass. This is a quiet improvement that the benchmarks don't capture and that has materially changed our internal tooling.
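For readers who never had to build one, the pre-long-context workflow looked roughly like this sketch. Token counts are approximated by whitespace-separated words, and `summarize()` is a hypothetical stand-in for a model call, not a real API.

```python
# Sketch of the old workflow: split a document into chunks that fit a
# small context window, summarize each chunk, then summarize the
# summaries. Every pass through summarize() discards detail, which is
# the "lossy compression" referred to above.

def chunk(words, budget):
    """Split a list of words into pieces of at most `budget` words."""
    return [words[i:i + budget] for i in range(0, len(words), budget)]

def summarize(words):
    # Hypothetical stand-in for an LLM call; here it just keeps
    # the first few words to make the lossiness visible.
    return words[:5]

def lossy_pipeline(text, budget=2000):
    words = text.split()
    if len(words) <= budget:
        return summarize(words)              # fits: single pass
    partials = [summarize(c) for c in chunk(words, budget)]
    merged = [w for p in partials for w in p]
    return summarize(merged)                 # second pass, more loss
```

With a 100K-token window, the `else` branch simply disappears for most documents: one call, no intermediate summaries, no compounding information loss.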
Multimodal. GPT-4V shipped in October. The barrier between language models and visual tasks is lowering. The practical applications are limited today by latency and cost, but the direction is clear.
Code generation quality. The gap between what models can generate and what engineers write has narrowed enough that the workflows have changed, not just the tooling. We use AI-assisted code review, AI-assisted sprint planning, AI-assisted documentation. These are not experiments. They're operational.
What's oversold
Autonomous agents. The demos are impressive. The production reliability is not there. We've run experiments with tool-using agents for internal tasks. They work on well-constrained tasks and fail unpredictably on ambiguous ones. The failure modes are hard to anticipate and harder to debug. This will improve. It's not ready for anything where failures have real consequences.
AI replacing roles wholesale. Every productivity tool changes how work gets done. It doesn't usually eliminate the need for people with good judgment about what the work should be. The engineers using AI assistance are more productive. The product managers who understand what to build are still necessary. The ratio may shift. The complete replacement narrative is ahead of the evidence.
"AI-native" as a category. Every startup is calling itself AI-native. Most of them mean they use the API. The genuinely AI-native products are the ones where the product design is impossible without the AI capability. That's a smaller set.
What I'm watching in 2024
Inference cost. The cost per token has dropped significantly this year. If it continues to drop, the economics of running LLMs in production change materially. Features that are expensive to run become cheap. The product design space expands.
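The economics claim is easy to make concrete with back-of-envelope arithmetic. The per-1K-token prices and usage figures below are illustrative assumptions, not quoted rates:

```python
# Rough monthly inference spend for one product feature.
# All prices and volumes are assumed for illustration.

def monthly_cost(requests_per_day, tokens_per_request, price_per_1k_tokens):
    """Monthly inference spend for a feature, in dollars (30-day month)."""
    daily = requests_per_day * tokens_per_request / 1000 * price_per_1k_tokens
    return daily * 30

# A feature handling 10K requests/day at 2K tokens each:
expensive = monthly_cost(10_000, 2_000, 0.06)   # assumed early-2023-style rate
cheap     = monthly_cost(10_000, 2_000, 0.002)  # assumed post-cut rate
# expensive ≈ $36,000/month; cheap ≈ $1,200/month.
```

A 30x price drop moves the same feature from "needs its own budget line" to "rounding error," which is exactly the product-design-space expansion described above.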
Compliance infrastructure. The European AI regulatory landscape is solidifying. The enterprise clients who couldn't use external LLM APIs because of data handling constraints will either get clearer on-premise options or will remain gated from the capability. This is the biggest blocker for AI adoption in our market.
Smaller, specialized models. The assumption that bigger is better is getting challenged. Models fine-tuned on specific domains are outperforming larger general models on domain tasks at a fraction of the inference cost. This changes the build-versus-buy calculation for AI features in specialized products.
What I know for certain
I've been building with AI since 2015. This year was different from every previous year in one specific way: the general awareness caught up to the technical capability. That creates opportunity and noise in equal measure. The people who will do the best work in 2024 are the ones who can tell the two apart.
With gusto, Fatih.