Ziya LLM
Chinese-English bilingual LLM series built on LLaMA. Training follows three stages: large-scale pre-training, multi-task supervised fine-tuning, and RLHF. Supports translation, programming, classification, information extraction, summarization, and math. Ziya2-13B builds on Llama 2 with 650B additional tokens of continued training; Ziya-Coding-34B reaches 75.5 HumanEval Pass@1, surpassing GPT-4's reported score.
- HuggingFace (Ziya-13B-v1)
- HuggingFace (Ziya2-13B)
- HuggingFace (Ziya-Coding-34B)
- GitHub (Fengshenbang-LM)
Outputs 4
Ziya-LLaMA-13B
13B bilingual LLM based on LLaMA with an optimized Chinese tokenizer and a three-stage training process including RLHF.
Parameters 13B
Variants
| Name | Parameters | Notes |
|---|---|---|
| Ziya-LLaMA-13B-v1 | — | — |
| Ziya-LLaMA-13B-v1.1 | — | — |
| Ziya-LLaMA-13B-Pretrain-v1 | — | — |
| Ziya-LLaMA-7B-Reward | — | Reward model for RLHF |
Ziya2-13B
Built on Llama 2 with 650B additional Chinese-English tokens. Trained with full-parameter RLHF on high-quality human preference data.
Parameters 13B
Ziya-Coding-34B
Code generation model achieving a HumanEval Pass@1 of 75.5, surpassing GPT-4's reported 67.0.
Parameters 34B
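Pass@1 here refers to the standard unbiased pass@k estimator used in HumanEval evaluations: generate n samples per problem, count the c that pass the unit tests, and estimate the probability that at least one of k samples is correct. A minimal sketch (the sample counts in the example comment are illustrative, not Ziya-Coding's actual evaluation numbers):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator (Chen et al., HumanEval/Codex):
    n = samples generated per problem, c = samples that pass the
    tests, k = budget. Returns P(at least one of k samples passes)."""
    if n - c < k:
        # Fewer than k failing samples: every k-subset contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example with hypothetical counts: 124 of 164 problems solved with a
# single sample each gives pass@1 = 124/164, roughly 0.756.
score = pass_at_k(164, 124, 1)
```

The benchmark-level score is the average of this estimate over all 164 HumanEval problems; with k = 1 and one sample per problem it reduces to the fraction of problems solved.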
Ziya-Visual
Multimodal vision-language model based on Ziya-LLaMA-13B with visual question answering and dialogue capabilities.
Parameters 14B