Last week, Chinese start-up DeepSeek disrupted the AI market with the launch of its R1 model. Alongside the launch of its AI chatbot, the company revealed in a research paper that it spent only $6mn on computing power per training run for the model, far below the estimated costs of other popular chatbots, notably OpenAI's ChatGPT and Google's Gemini. By comparison, Sam Altman, CEO of OpenAI, has said that training GPT-4 required $100mn.
The new model was trained on Nvidia H800 chips, a less advanced processor designed to comply with US export restrictions on China, showing that quality results in AI can be achieved with a smaller budget and fewer, less capable chips.
Commercially, the cost of using an AI model is usually priced per input and output token, the token being the smallest unit of text a model processes. Processing 1 million input tokens with DeepSeek-R1 costs just 55 cents, according to the DocsBot website, while generating 1 million output tokens costs $2.19.
This compares with $3 and $12, respectively, for ChatGPT; $5 and $15 for Grok, the AI model from Elon Musk's company xAI; and $1.25 and $5 for Google's Gemini.
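Using the per-million-token prices quoted above, the cost of a given workload can be compared with a short calculation. The sketch below is purely illustrative: the helper function and the example workload figures are hypothetical, while the prices are those cited in this article.

```python
# Quoted prices in USD per 1 million tokens (input, output),
# as cited above from DocsBot for DeepSeek-R1 and from the
# respective providers for the other models.
PRICES = {
    "DeepSeek-R1": (0.55, 2.19),
    "ChatGPT": (3.00, 12.00),
    "Grok": (5.00, 15.00),
    "Gemini": (1.25, 5.00),
}

def workload_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in USD of a workload with the given token counts."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# Hypothetical workload: 2 million input tokens, 500,000 output tokens.
for model in PRICES:
    print(f"{model}: ${workload_cost(model, 2_000_000, 500_000):.2f}")
```

At these rates, the same workload costs several times more on ChatGPT or Grok than on DeepSeek-R1, which is the basis of the pricing comparison above.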
On a range of industry benchmarks testing subject knowledge, comprehension, reasoning, accuracy and consistency, DeepSeek-R1 scores comparably to its rivals, and on some tests even slightly outperforms them.