HyperCLOVA
Korean GPT-3 variant trained on 560B tokens of Korean text. The arXiv paper covers the 82B model; press reports indicate a 204B variant was also trained but not published. One of the earliest large-scale Korean language models.
Model Details
Architecture: Dense (decoder-only Transformer, GPT-3 style)
Parameters: 204B (largest reported variant; see table below)
Variants
| Name | Parameters | Notes |
|---|---|---|
| HyperCLOVA-82B | 82B | Covered in the arXiv paper |
| HyperCLOVA-204B | 204B | Reported in press; no paper published |
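
As a sanity check on the headline sizes, the parameter count of a dense GPT-style decoder can be approximated as 12 · n_layers · d_model², and training compute as roughly 6 · N · D FLOPs for N parameters and D training tokens. A minimal sketch follows; the layer count and hidden size used for the 82B variant are illustrative assumptions, not quoted from the paper.

```python
# Back-of-the-envelope checks for a dense GPT-style decoder.
# Parameter count: P ~ 12 * n_layers * d_model**2 (attention ~4*d^2,
# feed-forward ~8*d^2 with the usual 4x expansion; embeddings and
# layer norms are lower-order terms and ignored here).

def estimate_params(n_layers: int, d_model: int) -> int:
    return 12 * n_layers * d_model ** 2

# Assumed configuration for the 82B variant (illustrative values,
# not quoted from the paper).
n_layers, d_model = 64, 10240
params = estimate_params(n_layers, d_model)
print(f"Estimated parameters: ~{params / 1e9:.0f}B")  # ~81B, near the 82B headline

# Training compute via the standard approximation C ~ 6 * N * D FLOPs,
# using the sizes stated in this section: N = 82B params, D = 560B tokens.
flops = 6 * 82e9 * 560e9
print(f"Estimated training compute: ~{flops:.1e} FLOPs")  # ~2.8e23 FLOPs
```

With these assumed values the estimate lands within about 2% of the 82B headline, which is typical for the 12 · n_layers · d_model² rule once embedding parameters are set aside.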
Paper
arXiv: 2109.04650 ("What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers")