DeepSeek V3.2 is the #2 most intelligent open weights model and also ranks ahead of Grok 4 and Claude Sonnet 4.5 (Thinking) - it takes DeepSeek Sparse Attention out of ‘experimental’ status and couples it with a material boost to intelligence
@deepseek_ai V3.2 scores 66 on the Artificial Analysis Intelligence Index, a substantial intelligence uplift (+9 points) over DeepSeek V3.2-Exp, released in September 2025. DeepSeek has switched its main API endpoint to V3.2, with no change from V3.2-Exp pricing - this puts pricing at just $0.28/$0.42 per 1M input/output tokens, with 90% off for cached input tokens.
Since the original DeepSeek V3 release ~11 months ago in late December 2024, DeepSeek's V3 architecture (671B total/37B active parameters) has taken them from a score of 32 to a score of 66 on the Artificial Analysis Intelligence Index.
DeepSeek has also released V3.2-Speciale, a reasoning-only variant with enhanced capabilities but significantly higher token usage. This is a common tradeoff in reasoning models, where longer reasoning generally yields higher intelligence scores alongside more output tokens. V3.2-Speciale is available via DeepSeek's first-party API until December 15.
V3.2-Speciale currently scores lower on the Artificial Analysis Intelligence Index (59) than V3.2 (Reasoning, 66) because DeepSeek's first-party API does not yet support tool calling for this model. If V3.2-Speciale matched V3.2's tau2 score (91%) with tool calling enabled, it would score ~68 on the Intelligence Index, making it the most intelligent open-weights model. V3.2-Speciale uses 160M output tokens to run the Artificial Analysis Intelligence Index, nearly 2x the number of tokens used by V3.2 in reasoning mode.
DeepSeek V3.2 uses an identical architecture to V3.2-Exp, which introduced DeepSeek Sparse Attention (DSA) to reduce the compute required for long context inference. Our Long Context Reasoning benchmark showed no intelligence cost from the introduction of DSA. DeepSeek passed this cost advantage of V3.2-Exp through by cutting pricing on their first-party API from $0.56/$1.68 to $0.28/$0.42 per 1M input/output tokens - a 50% and 75% reduction for input and output tokens respectively.
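To make the pricing concrete, here is a minimal back-of-envelope cost estimator using the first-party API rates quoted above ($0.28/$0.42 per 1M input/output tokens, 90% off cached input). The function name and example token counts are illustrative, not part of any DeepSeek SDK:

```python
# Illustrative sketch of DeepSeek V3.2 first-party API pricing
# (rates from the post; function and example values are hypothetical).
INPUT_PRICE = 0.28 / 1_000_000   # USD per input token
OUTPUT_PRICE = 0.42 / 1_000_000  # USD per output token
CACHE_DISCOUNT = 0.90            # 90% off cached input tokens

def request_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one API call at V3.2 list pricing."""
    fresh = input_tokens - cached_tokens
    return (
        fresh * INPUT_PRICE
        + cached_tokens * INPUT_PRICE * (1 - CACHE_DISCOUNT)
        + output_tokens * OUTPUT_PRICE
    )

# e.g. 100k input tokens (half served from cache) plus 20k output tokens:
print(round(request_cost(100_000, 20_000, cached_tokens=50_000), 4))
```

At these rates even long-context, cache-heavy workloads stay in fractions of a cent per call, which is what makes the Pareto-frontier placement discussed below possible.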
Key benchmarking takeaways:
➤🧠 DeepSeek V3.2: In reasoning mode, DeepSeek V3.2 scores 66 on the Artificial Analysis Intelligence Index and places equivalently to Kimi K2 Thinking (67) and ahead of Grok 4 (65), Grok 4.1 Fast (Reasoning, 64) and Claude Sonnet 4.5 (Thinking, 63). It demonstrates notable uplifts compared to V3.2-Exp (57) across tool use, long context reasoning and coding.
➤🧠 DeepSeek V3.2-Speciale: V3.2-Speciale scores higher than V3.2 (Reasoning) across 7 of the 10 benchmarks in our Intelligence Index. V3.2-Speciale now holds the highest and second highest scores amongst all models for AIME25 (97%) and LiveCodeBench (90%) respectively. However, as mentioned above, DeepSeek’s first-party API for V3.2-Speciale does not support tool calling and the model gets a score of 0 on the tau2 benchmark.
➤📚 Hallucination and Knowledge: DeepSeek V3.2-Speciale and V3.2 are the highest ranked open weights models on the Artificial Analysis Omniscience Index scoring -19 and -23 respectively. Proprietary models from Google, Anthropic, OpenAI and xAI typically lead this index.
➤⚡ Non-reasoning performance: In non-reasoning mode, DeepSeek V3.2 scores 52 on the Artificial Analysis Intelligence Index (+6 points vs. V3.2-Exp) and is the #3 most intelligent non-reasoning model. DeepSeek V3.2 (Non-reasoning) matches the intelligence of DeepSeek R1 0528, a frontier reasoning model from May 2025, highlighting the rapid intelligence gains achieved through pre-training and RL improvements this year.
➤⚙️ Token efficiency: In reasoning mode, DeepSeek V3.2 used more tokens than V3.2-Exp to run the Artificial Analysis Intelligence Index (from 62M to 86M). Token usage remains similar for the non-reasoning variant. V3.2-Speciale demonstrates significantly higher token usage, at ~160M output tokens, ahead of Kimi K2 Thinking (140M) and Grok 4 (120M).
➤💲Pricing: DeepSeek has not updated per-token pricing for their first-party API, and all three variants are available at $0.28/$0.42 per 1M input/output tokens.
Other model details:
➤ ©️ Licensing: DeepSeek V3.2 is available under the MIT License
➤ 🌐 Availability: DeepSeek V3.2 is available via DeepSeek API, which has replaced DeepSeek V3.2-Exp. Users can access DeepSeek V3.2-Speciale via a temporary DeepSeek API until December 15. Given the intelligence uplift in this release, we expect a number of third-party providers to serve this model soon.
➤ 📏 Size: DeepSeek V3.2 has 671B total parameters and 37B active parameters, the same as all previous models in the DeepSeek V3 and R1 series.

At DeepSeek's first-party API pricing of $0.28/$0.42 per 1M input/output tokens, V3.2 (Reasoning) sits on the Pareto frontier of the Intelligence vs. Cost to Run Artificial Analysis Intelligence Index chart

DeepSeek V3.2-Speciale is the highest ranked open weights model on the Artificial Analysis Omniscience Index while V3.2 (Reasoning) matches Kimi K2 Thinking

DeepSeek V3.2 is more verbose than its predecessor in reasoning mode, using more output tokens to run the Artificial Analysis Intelligence Index (86M vs. 62M).
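At the $0.42 per 1M output-token rate, that verbosity maps directly onto the cost of running the full Intelligence Index. A rough sketch using only the output-token figures from this post (input-token costs are ignored here for simplicity):

```python
# Back-of-envelope: output-token cost to run the Artificial Analysis
# Intelligence Index, using token counts quoted in the post.
OUTPUT_PRICE_PER_M = 0.42  # USD per 1M output tokens (DeepSeek first-party API)

def output_cost_usd(millions_of_tokens: float) -> float:
    """USD spent on output tokens alone for a given token count in millions."""
    return millions_of_tokens * OUTPUT_PRICE_PER_M

costs = {
    "V3.2-Exp (Reasoning)": output_cost_usd(62),    # 62M output tokens
    "V3.2 (Reasoning)": output_cost_usd(86),        # 86M output tokens
    "V3.2-Speciale": output_cost_usd(160),          # ~160M output tokens
}
for model, cost in costs.items():
    print(f"{model}: ${cost:.2f}")
```

Even the most verbose variant lands well under $100 in output spend for the whole Index, underscoring how cheap these models are to evaluate relative to frontier proprietary APIs.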

Compare how DeepSeek V3.2 performs relative to models you are using or considering at:

