High-performance toolkit for compressing, deploying, and serving LLMs at scale.

Library

GitHub Repository

efficiency, framework

Notes

Date approximate.