Why DeepSeek outperforms ChatGPT with a 97.6% lower budget.

DeepSeek chatter has gone off the rails, turning it into a global phenomenon almost overnight. Its release into popular culture caused multiple Fortune 500 stocks to crash, spawned a slew of buzzword-y “content” on social media, and created a lot of misunderstanding around what’s really going on, including people claiming it can reprogram itself (SKYNET, anyone?).

Image Credit: Ank Kumar (www.twitter.com/ank_kumar)
Is it Chinese hackers? Is it an AI play to overthrow American businesses? People are definitely riding the shock and awe trend to get attention. But hey, that’s the same reason I’m writing this blog… so who am I to judge?
For those of you wondering where I got my math from, it’s based on the approximate amounts each business spent to develop and “train” its AI models. OpenAI, whose ChatGPT was one of the first-to-market AIs, spent $500 million to develop its model, which is now on its GPT-4o version. Full disclosure: I’m a huge fan of it, an “early adopter,” and a self-described lazy human who is happy to use AI to take as many shortcuts as humanly possible.
Let’s talk DeepSeek. Using Nvidia’s H800 GPUs, which are considered “mid-range” compared to the high-end chips that OpenAI and Google are using, the company was essentially forced to train its AI on a micro-budget of $12 million (a mere 2.4% of ChatGPT’s budget). I want to break down what made this possible and why small teams with small budgets wind up becoming more game-changing than massive organizations with inefficient processes and bloated decision-making.
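For anyone who wants to check the headline math, here is a minimal sketch of the calculation. The two dollar figures are the approximate ones quoted in this post, not audited numbers, and the function name `budget_share` is just something I made up for illustration.

```python
# Approximate training budgets as quoted in this post (assumed, not audited).
CHATGPT_BUDGET_USD = 500_000_000
DEEPSEEK_BUDGET_USD = 12_000_000


def budget_share(smaller: float, larger: float) -> float:
    """Return the smaller budget as a percentage of the larger one."""
    return smaller / larger * 100


# DeepSeek's budget expressed as a share of ChatGPT's budget.
share = budget_share(DEEPSEEK_BUDGET_USD, CHATGPT_BUDGET_USD)

# The "lower budget" figure is simply what is left over from 100%.
reduction = 100 - share

print(f"DeepSeek budget as a share of ChatGPT's: {share:.1f}%")
print(f"Budget reduction: {reduction:.1f}%")
```

Swap in different estimates for either budget and the percentages update accordingly, which is handy given how loosely these numbers get thrown around.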

By Stanford Institute for Human-Centered Artificial Intelligence
1) The Law of Entropy
ChatGPT's Take:
Entropy suggests that as things grow, they naturally become less efficient unless energy is applied to maintain order and optimize systems.
In a business context:
As a company scales, complexity increases (more employees, processes, and moving parts).
Without intentional organization and refinement, inefficiencies, bureaucracy, and waste accumulate.
Growth itself tends toward disorder unless managed with systems, automation, and continuous improvement.
DeepSeek: Examples in Business and Organizations
Business Growth:
A small startup may operate efficiently with a flat structure and clear communication.
As the company grows, it may add layers of management, departments, and processes, which can lead to bureaucracy, miscommunication, and inefficiency if not carefully managed.
Full disclosure, it actually took me multiple iterations of asking these AIs the “right” questions to get to something applicable to this scenario. In reality, entropy is a term that comes from the Second Law of Thermodynamics, which states that things naturally move toward disorder. I’m sure you’ve seen this firsthand if you’ve worked at different-sized businesses.

Is this her first or last day?
Until recently, I wasn’t familiar with the concept of entropy, but I’ve seen this phenomenon play out in extreme ways while working in corporate America. Over my 15-year career, I went from working at a startup to a $3 billion Fortune 500 company within about five years. The difference in agility between the two was shocking.
I remember being in executive meetings at this massive company, knowing in the back of my mind that nothing we were discussing would ever get solved. The layers of bureaucracy that needed to be pierced to create change were sky-high. Not to mention, we had almost no goal alignment as department heads. Our incentives weren’t aligned with the C-Suite’s: as I later found out, their goal was to get acquired by a competitor (which we were kept in the dark about).
Big businesses allow people to find a “sweet spot”—which is really just the “stay in your lane” spot. In contrast, at the healthcare startup I worked at, I actually saw real changes happen quickly from good ideas. We could measure success, fail fast, pivot, and feel like our work mattered (which is apparently really good for morale 🙃).
This is where DeepSeek shines. Smaller teams = less entropy. Fewer decision-makers mean less bureaucracy, more agility, and a higher likelihood of innovation actually being implemented.
| Metric | DeepSeek R1 | ChatGPT |
|---|---|---|
| Mathematics | 90% accuracy (surpasses GPT-4o) | 83% accuracy on advanced benchmarks |
| Coding | 97% success rate in logic puzzles | Top-tier debugging (89th percentile on Codeforces) |
| Reasoning | RL-driven step-by-step explanations | Superior multi-step problem-solving |
| Multimodal Tasks | Text-only focus | Supports text and image inputs |
| Context Window | 128K tokens | 200K tokens |
2) The Second Mover Advantage
ChatGPT's Take:
The Second Mover Advantage is the strategic benefit a company gains by learning from the mistakes and successes of the first mover (the pioneer in an industry or market). Instead of being the innovator that takes all the risks, the second mover refines the model, avoids costly errors, and often outperforms the original.
Why Second Movers Win:
Lower Risk – They let the first mover test the waters, then enter with a proven concept.
Refined Business Model – They can improve upon what works and fix inefficiencies.
Better Technology & Timing – They enter at a stage when tech is more advanced and adoption is easier.
Stronger Branding & Execution – They learn from customer feedback and market reactions to perfect their positioning.
Avoiding High Initial Costs – First movers spend heavily on R&D, marketing, and educating the market. Second movers capitalize on their groundwork.
DeepSeek: Examples of Second Mover Success
Google vs. Early Search Engines:
Google entered the search engine market after AltaVista and Yahoo but offered a superior algorithm and user experience, dominating the market.
Facebook vs. MySpace:
Facebook improved on MySpace’s social networking model by focusing on user privacy, cleaner design, and scalability.
Tesla vs. Early Electric Vehicles:
Tesla wasn’t the first electric car company, but it refined the technology and created a luxury brand that redefined the industry.
Zoom vs. Skype:
Zoom entered the video conferencing market after Skype but offered a more reliable and user-friendly platform, especially for businesses.
Being first is expensive. Being first is dangerous. You’re almost guaranteed to have a higher failure rate because the market hasn’t been tested. Industry creators (first movers) bear all the risk, while industry disruptors (second movers) swoop in, take what works, and improve it.
That’s exactly what we’re seeing with DeepSeek vs. ChatGPT.
DeepSeek didn’t have to build the AI ecosystem from scratch. OpenAI, Google, and Anthropic had already laid the foundation. Instead, DeepSeek optimized their model, ran it on cheaper GPUs, and leveraged open-source frameworks to create something revolutionary on a micro-budget. They sidestepped billions in costs by letting OpenAI spend that money first.
Final Thoughts
DeepSeek is a perfect case study of entropy and the second mover advantage in action.
Entropy: DeepSeek's lean team and budget forced them to stay efficient, while OpenAI's massive budget and bureaucracy led to more complexity.
Second Mover Advantage: DeepSeek didn't need to build from scratch. They learned from OpenAI, Google, and others, optimizing AI training at a fraction of the cost.
Unbeknownst to most, AI development isn’t just a money game; it’s a strategy game. The winner isn’t always the one who spends the most; sometimes, it’s the one who spends the smartest.
And if history is any indicator, OpenAI should be very, very nervous right now.