DeepSeek-Coder
model, paper
Coding-specialized models (1.3B to 33B parameters) trained on 2 trillion tokens, with an accompanying technical report.
Outputs 2
DeepSeek-Coder
model
Architecture: Dense
Variants
| Name | Parameters | Notes |
|---|---|---|
| DeepSeek-Coder-1.3B | 1.3B | — |
| DeepSeek-Coder-6.7B | 6.7B | — |
| DeepSeek-Coder-33B | 33B | — |
DeepSeek-Coder: When the Large Language Model Meets Programming - The Rise of Code Intelligence
paper
Technical report on the rise of code intelligence with DeepSeek-Coder models.
arXiv: 2401.14196