Discussion about this post

Neural Foundry

Exceptional breakdown of DeepSeek V3.2's strategic positioning in the compute landscape. Your observation that they chose to scale existing architecture rather than chase novelty is precisely what makes this release so instructive. The DSA framework's ability to maintain performance while reducing computational complexity demonstrates that efficiency gains remain underexploited across the field.

What strikes me most is their candid admission that their models' world knowledge gaps stem from lower training FLOPs, which reframes the entire open vs. closed source debate. The widening performance gap you documented isn't a failure of open-source models; it's a compute allocation story. Their post-training budget exceeding ten percent of pre-training cost is particularly striking, because most labs don't publish those ratios, making it hard to benchmark where resources should flow. The agent generalization work with 1,800 environments also signals a methodological shift from task-specific tuning toward general capability development that could reshape how teams approach deployment.

Austin Nellessen

Hi! This is a fascinating read, and hopefully someday I can learn to read published papers half as well as you can!

Quick question: When you cite DeepSeek's comments that proprietary (closed-source) models are accelerating their progress beyond open-source models, why do you think this may be the case? What difference between closed and open source would give this edge, other than perhaps funding and chips?

Again, loved the article, and the name “geopolitechs”!

