AquilaMed-RL

Medical-domain language model built on the Aquila foundation model through continued pre-training, supervised fine-tuning (SFT), and reinforcement learning (RL). Handles medical triage, medication inquiries, and general medical Q&A. Demonstrates a significant win rate against annotated data using GPT-4. The SFT and RL training data are open-sourced alongside the model.

HuggingFace Paper (arXiv)

biology open-weight
