I’m assuming most readers are at least somewhat aware of the DeepSeek news of the past week, which culminated in a large sell-off in AI stocks on Monday.
I thought it worth putting down some thoughts about where this may go.
Quick background
A small Chinese company, DeepSeek, released a large language model that matches or betters many of the leading Western models. Reported costs are around 10% (i.e. 90% lower) vs similar models. Some are suggesting the end of the AI boom.
Interesting features
- DeepSeek is not from one of the big Chinese IT companies, such as Baidu, Tencent, or Alibaba, which are spending a lot on AI.
- DeepSeek was trained on NVIDIA chips, but far fewer than (say) OpenAI is using. DeepSeek acknowledged that it is limited by chip availability and is actively seeking more computing power.
- The 90% lower figure might be overstated, perhaps because of black-market NVIDIA chips, perhaps hyperbole. Competitors are suggesting costs closer to 30-50% of comparable models. That is still great, but it is 2-3x cheaper rather than 10x.
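The "10x vs 2-3x" framing above is just ratio arithmetic. A quick sketch, using the illustrative figures quoted in the bullet (the exact cost numbers are unconfirmed):

```python
# Cost-ratio arithmetic behind the "10x vs 2-3x" framing.
# The cost fractions are the illustrative figures quoted above, not confirmed data.

def cost_advantage(cost_fraction: float) -> float:
    """How many times cheaper a model is if it costs
    `cost_fraction` of a comparable model."""
    return 1.0 / cost_fraction

# DeepSeek's reported figure: ~10% of comparable models' cost.
print(cost_advantage(0.10))  # 10.0, i.e. "10x cheaper"

# Competitors' estimate: 30-50% of comparable cost.
print(cost_advantage(0.50))  # 2.0
print(round(cost_advantage(0.30), 1))  # 3.3, i.e. roughly "2-3x cheaper"
```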
- The resulting model (the part that answers queries) is small enough, at least in distilled form, to run on many home PCs.
- Most of the benefits seem to come from a change of learning methodology. But not a revolutionary change: it has effectively taken the OpenAI approach and improved on it. OpenAI basically did the same thing to Google. All the models run the same (broad) underlying calculation.
- It is open source. So, everyone else is scrambling to copy the insights into their (mostly) private models.
- Given that it is open source, it is not that difficult to turn off the Chinese censorship of results.
What does this mean for AI spending?
It is hard to see how this decreases AI spending. No CEO is waking up today and deciding to apply the improvements and then use them to cut costs in the AI department. But spending will be more productive.
DeepSeek has been running largely for free. It probably can’t do that at scale unless the Chinese government wants to fund AI hardware, presumably in order to capture the data. Given the US was looking to ban TikTok over user data, it seems likely that Chinese AI might face similar issues.
Net effect: more productive spending is likely to mean more spending.
Trillion-dollar question for NVIDIA
Will you need the highest-end chips?
Most of the appeal for NVIDIA is that:
a) it has a big lead at the top end of the market
b) in a world of constrained computing power, the top end was so important
c) thus, it seemed NVIDIA’s economic moat would last for a while.
DeepSeek clearly shows that you don’t need the biggest budget to achieve a breakthrough.
The question now is whether data centres will install lower-end chips for much less or stick with the higher-end chips.
My initial guess? Currently, the world can’t build data centres fast enough. Both power and (experienced) labour are the bottlenecks. Given these constraints, it is unlikely that someone would choose to build (say) three data centres with less powerful chips rather than two data centres with the most powerful chips.
Importantly, the new data centres have a different design. It is about networking vast arrays of GPUs. There is nothing in DeepSeek suggesting we go back to old data centre design.
Switching costs are low
You are only as good as your latest model. Many of the service models being built now allow you to change the AI engine quickly and easily. That is going to keep profits low for many of the AI providers (Google, OpenAI, Anthropic).
It is all about services.
Training / Inference mix
First, remember that we are taking really small trends and trying to extend them out years.
I wrote this up last year, when the OpenAI o1 model suggested that inference would be more expensive in future.
Basically, it was a suggestion that queries were going to be a lot more expensive relative to training.
DeepSeek is a step in the other direction, but it is early days. Maybe the techniques OpenAI used for o1 will be added to DeepSeek, and we will be back in the same place.
Open Source?
The open-source nature might suggest that we end up with a few globally trained models that everyone adapts to their own use cases.
Open-source software like Python, Linux, and MySQL dominates usage now. Companies make money by providing services on top of these tools. DeepSeek is a vote for AI going in the same direction.
Fallout for companies
Apple: Not much. I am a skeptic that AI models will run on people’s phones; my expectation has been that phones will ping servers that answer the question. Maybe DeepSeek suggests that more core elements than I thought can be hosted on phones, for example a model trained specifically to recognise my voice.
Amazon: not much effect. Best case is that building data centres will be cheaper. Worst case is that China subsidises free AI in order to vacuum up Western data and the US does nothing about it.
Alphabet: Similar to Amazon in some respects, as they are more of a service provider. However, it is another sign that the former undisputed AI leader faces more competition, plus more concerns about the breadth of Google’s moat in search.
Microsoft: Similar to Amazon. OpenAI investment may be less valuable.
NVIDIA: See “Trillion-dollar question” above. The narrative is that this is bad for NVIDIA. I’m finding that difficult to reconcile. Maybe there is an argument that we reach “AI saturation” earlier with this announcement. But that is countered by the argument that AI is now more productive and so worth spending more on.
NVIDIA is not ridiculously priced. A “steady-state” semiconductor stock typically trades at a 20%ish discount to the market. Current prices suggest three or four years of above-average growth followed by a transition to steady state. Maybe that is right. However, it seems more likely that NVIDIA is looking at 5-10 years, at least, of above-average growth.
Upshot
Markets have been on a tear for a while and are probably due a pullback.
But, DeepSeek doesn’t seem to be an AI boom-ending event.
I have two analogies which I think are relevant.
In the 2015-16 season, Leicester City won the English Premier League while spending a fraction of what other clubs spent. I think of DeepSeek like Leicester City. The lesson for the other clubs was not to spend less. Nor will it be for the other big AI companies.
More relevantly, around 1999 telecommunications companies were rapidly rolling out internet services. The internet was still too slow for most video applications. A new technology, ADSL, meant that the same copper lines could deliver much faster speeds. This made existing infrastructure more efficient, but it did not stop the telcos spending on faster and faster connections. ADSL helped, but it did not deliver the required speeds. DeepSeek appears the same: it makes existing infrastructure more efficient, but AI appears to have much further to grow.