tencent/Hy-MT1.5-1.8B-2bit-GGUF
Dedicated to building a more intuitive, comprehensive, and efficient LLM compression toolkit.
📣 Weights | ✒️ AngelSlim Report | 📖 Documentation | 🤗 AngelSlim | 💬 WeChat
Hy-MT1.5-1.8B translation quality scores. Source: HY-MT1.5 Technical Report
📣 Latest News
- [26/04/29] We have released Hy-MT1.5-1.8B-2bit (574MB) and Hy-MT1.5-1.8B-1.25bit (440MB), on-device translation models supporting 33 languages, with both weights and GGUF formats available.
- [26/02/09] We have released HY-1.8B-2Bit, a 2-bit on-device large language model.
- [26/01/13] We have released v0.3, which supports training and deployment of Eagle3 for LLMs/VLMs/audio models of all scales, and introduces Sherry, a hardware-efficient 1.25-bit quantization algorithm. [Paper] | [Code]
For more detailed information, please refer to [AngelSlim] and [HY-MT].
🌟 Hy-MT1.5-1.8B-2bit-GGUF Key Features
- **World-Class Translation Quality**: Hy-MT1.5-1.8B-2bit is built upon the Hy-MT1.5-1.8B foundation model, a specialized translation model developed by the Tencent Hunyuan Team through a holistic multi-stage training pipeline that integrates MT-oriented pre-training, supervised fine-tuning, on-policy distillation, and reinforcement learning. The base model natively supports 33 languages, 5 dialects/minority languages, and 1,056 translation directions. With only 1.8B parameters, it comprehensively outperforms much larger open-source models (e.g., Tower-Plus-72B, Qwen3-32B) and mainstream commercial translation APIs (e.g., Microsoft Translator, Doubao Translator). For full details, please refer to the HY-MT1.5 Technical Report.
- **Ultra-Compact 2-bit Quantization**: Hy-MT1.5-1.8B-2bit employs industry-leading Stretched Elastic Quantization (SEQ) to quantize model weights to the four levels {-1.5, -0.5, 0.5, 1.5}, combined with quantization-aware distillation. This compresses the original 3.3GB FP16 model down to just 574MB while maintaining near-lossless translation quality that surpasses models hundreds of GBs in size. The quantization details are described in the AngelSlim Technical Report.
- **On-Device Deployment**: Optimized for Arm SME2-capable mobile devices (e.g., Apple M4, vivo X300), the 2-bit model enables fast, fully offline translation directly on your phone, with no internet connection required. Your data never leaves the device, ensuring complete privacy.
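To give a feel for the weight format, the sketch below rounds FP weights to the nearest of the four 2-bit levels {-1.5, -0.5, 0.5, 1.5} with a naive per-row scale. This is only an illustration of 2-bit level mapping, not the SEQ algorithm or its quantization-aware distillation (see the AngelSlim Technical Report for those).

```python
import numpy as np

# Illustrative sketch only: NOT the actual SEQ algorithm, just a minimal
# round-to-nearest mapping onto the same four 2-bit levels, using a naive
# per-row scale (an assumption for demonstration purposes).
LEVELS = np.array([-1.5, -0.5, 0.5, 1.5])

def quantize_2bit(w: np.ndarray):
    """Map each row of `w` to 2-bit codes; return (codes, dequantized weights)."""
    scale = np.abs(w).mean(axis=1, keepdims=True) + 1e-8  # per-row scale (assumption)
    # Distance of each scaled weight to each level -> index of nearest level.
    codes = np.abs(w[:, :, None] / scale[:, :, None] - LEVELS).argmin(axis=-1)
    dequant = LEVELS[codes] * scale  # what a kernel would reconstruct at runtime
    return codes.astype(np.uint8), dequant

# Back-of-envelope size check: 1.8e9 params * 2 bits = 450 MB of raw codes;
# the released 574 MB file presumably also stores scales, embeddings, and
# metadata at higher precision (assumption, not stated in the report).
w = np.random.randn(8, 64).astype(np.float32)
codes, w_hat = quantize_2bit(w)
print(codes.dtype, w_hat.shape)  # uint8 (8, 64)
```

Each code fits in 2 bits (values 0-3), which is where the roughly 5-6x compression over FP16 comes from before packing overhead.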
📈 Translation Benchmarks
Performance comparison of different model sizes on the Flores-200 Chinese-Foreign mutual translation benchmark:
⚡ Speed Demo
Speed comparison of the 2-bit model on SME2 and Neon kernels:
📱 Demo
We provide a ready-to-use Android demo APK for offline translation. The app features a background word extraction mode that works across any app on your phone — browse emails, webpages, or chat messages and get instant translations without switching apps. No network required, no data collection, one-time download for permanent use.
Download Demo:
https://huggingface.co/AngelSlim/Hy-MT1.5-1.8B-1.25bit-GGUF/resolve/main/Hy-MT-demo.apk
Translation Demo
Demo device: Snapdragon 865, 8GB RAM.
Background Word Extraction Mode
Demo device: Snapdragon 7+ Gen 2, 16GB RAM.
💻 Deployment
Our llama.cpp kernel is coming soon.
📥 Download Links
- 2-bit model weights: https://huggingface.co/AngelSlim/Hy-MT1.5-1.8B-2bit
- 2-bit model GGUF: https://huggingface.co/AngelSlim/Hy-MT1.5-1.8B-2bit-GGUF
- 1.25-bit model weights: https://huggingface.co/AngelSlim/Hy-MT1.5-1.8B-1.25bit
- 1.25-bit model GGUF: https://huggingface.co/AngelSlim/Hy-MT1.5-1.8B-1.25bit-GGUF
- Demo: https://huggingface.co/AngelSlim/Hy-MT1.5-1.8B-1.25bit-GGUF/resolve/main/Hy-MT-demo.apk
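All of the artifacts above follow Hugging Face's `resolve/main` direct-download URL pattern. As a small standard-library sketch (file names other than the demo APK are not listed here, so check each repo's file listing for exact GGUF names):

```python
from urllib.parse import quote

HF_BASE = "https://huggingface.co"

def resolve_url(repo_id: str, filename: str) -> str:
    """Build the resolve/main direct-download URL for a file in an HF repo."""
    return f"{HF_BASE}/{quote(repo_id)}/resolve/main/{quote(filename)}"

# Reproduces the demo link from the list above:
print(resolve_url("AngelSlim/Hy-MT1.5-1.8B-1.25bit-GGUF", "Hy-MT-demo.apk"))
# https://huggingface.co/AngelSlim/Hy-MT1.5-1.8B-1.25bit-GGUF/resolve/main/Hy-MT-demo.apk
```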
📄 Technical Reports
- HY-MT1.5 Technical Report: https://arxiv.org/abs/2512.24092
- AngelSlim Technical Report: https://arxiv.org/abs/2602.21233
- Sherry Paper: https://arxiv.org/abs/2601.07892
📝 License
The code for this project is open-sourced under the AngelSlim License.
🔗 Citation
@article{angelslim2026,
  title={AngelSlim: A more accessible, comprehensive, and efficient toolkit for large model compression},
  author={Hunyuan AI Infra Team},
  journal={arXiv preprint arXiv:2602.21233},
  year={2026}
}

@misc{zheng2025hymt,
  title={HY-MT1.5 Technical Report},
  author={Mao Zheng and Zheng Li and Tao Chen and Mingyang Song and Di Wang},
  year={2025},
  eprint={2512.24092},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2512.24092},
}
💬 Technical Discussion
- AngelSlim is under active development, and new features will be released regularly. If you have any questions or suggestions, please open an issue on GitHub Issues or join our WeChat discussion group.