StarCoder
15.5B-parameter code LLM trained on 80+ programming languages from The Stack (1T tokens). 8K-token context window with fill-in-the-middle support. Part of the BigCode collaboration between ServiceNow and Hugging Face.
StarCoder outperformed existing open code models and matched Codex on HumanEval. StarCoder2 (February 2024; 3B/7B/15B variants trained on 3.3–4.3T tokens, in collaboration with NVIDIA) extended the series, outperforming CodeLlama-34B. Released under the OpenRAIL-M license.
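The fill-in-the-middle capability mentioned above works by rearranging a completion task with sentinel tokens: the model sees the code before and after a gap, then generates the missing middle. A minimal sketch of the prefix-suffix-middle prompt format using StarCoder's FIM sentinels (actual generation would require loading the checkpoint, e.g. via HuggingFace `transformers`, which is omitted here):

```python
def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Assemble a prefix-suffix-middle (PSM) fill-in-the-middle prompt.

    The sentinel tokens are StarCoder's FIM special tokens; the model
    generates the missing middle after the <fim_middle> sentinel.
    """
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"


# Example: ask the model to fill in the body of a function,
# given the signature (prefix) and the return statement (suffix).
prompt = build_fim_prompt(
    prefix="def fibonacci(n):\n    ",
    suffix="\n    return a",
)
print(prompt)
```

In practice this prompt string would be tokenized and passed to the model's `generate` method, with decoding stopped at the end-of-text token.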
Model Details
Architecture Dense
Parameters 15.5B
Context window 8,192 tokens
Paper
arXiv: 2305.06161