1bit-Merging
A dynamic quantized merging framework for large language models that combines task-specific routing with 1-bit quantized task vectors. It builds on the observation that different task-specific models store knowledge in distinct layers (chat models in the attention layers; math and code models in the MLP layers), balancing performance against storage cost when merging domain-specific fine-tuned models.
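The core compression idea, 1-bit quantization of a task vector (the element-wise difference between fine-tuned and base weights), can be sketched as follows. This is a hypothetical minimal illustration, not the paper's code: it assumes a simple sign-plus-scale scheme in which each element is reduced to its sign and a single per-tensor scale preserves the average magnitude.

```python
import numpy as np

def one_bit_quantize(task_vector: np.ndarray):
    """Compress a task vector to 1 bit per element: keep only the sign,
    plus one floating-point scale so the reconstruction matches the
    original tensor's mean absolute magnitude. (Illustrative sketch.)"""
    signs = np.sign(task_vector).astype(np.int8)  # 1-bit payload: +1 / -1
    scale = float(np.abs(task_vector).mean())     # per-tensor scale
    return signs, scale

def dequantize(signs: np.ndarray, scale: float) -> np.ndarray:
    """Reconstruct an approximate task vector from signs and scale."""
    return signs.astype(np.float32) * scale

# Toy example: delta between a fine-tuned and a base weight tensor.
base = np.zeros(4, dtype=np.float32)
finetuned = np.array([0.2, -0.1, 0.4, -0.3], dtype=np.float32)
delta = finetuned - base

signs, scale = one_bit_quantize(delta)
# In the framework described above, a router would decide which
# task's quantized delta to apply to the base weights at inference.
merged = base + dequantize(signs, scale)
```

Storing only signs and one scale per tensor cuts the task vector's footprint to roughly 1/16 of FP16 storage, which is what makes keeping multiple domain-specific deltas around for routing practical.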
1bit-Merging: Dynamic Quantized Merging for Large Language Models
arXiv: 2502.10743