Coding-specialized models (1.3B to 33B parameters) trained on 2 trillion tokens, with an accompanying technical report.

Outputs (2)

DeepSeek-Coder

model
Architecture: Dense

Variants

Name                  Parameters  Notes
DeepSeek-Coder-1.3B   1.3B
DeepSeek-Coder-6.7B   6.7B
DeepSeek-Coder-33B    33B

DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence

paper

Technical report on the rise of code intelligence with DeepSeek-Coder models.

arXiv: 2401.14196

coding, open-weight