A Review of DeepSeek Models’ Key Innovative Techniques
Chengen Wang    Murat Kantarcioglu
University of Texas at Dallas    Virginia Tech
chengen.wang@    muratk@
Abstract
DeepSeek-V3 and DeepSeek-R1 are leading open-source Large Language Models (LLMs) for general-purpose tasks and reasoning, achieving performance comparable to state-of-the-art closed-source models from companies like OpenAI and Anthropic, while requiring only a fraction of their training costs. Understanding the key innovative techniques behind DeepSeek’s success is crucial for advancing LLM research. In this paper, we review the core techniques driving the remarkable effectiveness and efficiency of these models, including refinements to the transformer architecture, innovations such as Multi-Head Latent Attention and Mixture of Experts, Multi-Token Prediction, the co-design of algorithms, frameworks, and hardware, the Group Relative Policy Optimization algorithm, post-training with pure reinforcement learning, and iterative training alternating between supervised fine-tuning and reinforcement learning. Additionally, we identify several open questions and highlight potential research opportunities in this rapidly advancing field.

Keywords: DeepSeek, Multi-Head Latent Attention, Mixture of Experts, Group Relative Policy Optimization (GRPO)

arXiv preprint, v1 [cs.LG], 14 Mar 2025

1 Introduction