tencent/Hy-MT1.5-1.8B-2bit-GGUF
Dedicated to building a more intuitive, comprehensive, and efficient LLM compression toolkit.
📣 Weights | ✒️ AngelSlim Report | 📖 Documentation | 🤗 AngelSlim | 💬 WeChat
Hy-MT1.5-1.8B translation quality scores. Source: HY-MT1.5 Technical Report
📣 Latest News
- [26/04/29] We have released Hy-MT1.5-1.8B-2bit (574MB) and Hy-MT1.5-1.8B-1.25bit (440MB), on-device translation models supporting 33 languages, with both weights and GGUF formats available.
- [26/02/09] We have released HY-1.8B-2Bit, a 2-bit on-device large language model.
- [26/01/13] We have released v0.3, which supports training and deployment of Eagle3 for LLMs/VLMs/audio models of all scales, and introduces Sherry, a hardware-efficient 1.25-bit quantization algorithm. [Paper] | [Code]
For more detailed information, please refer to [AngelSlim] and [HY-MT].
🌟 Hy-MT1.5-1.8B-2bit-GGUF Key Features
- **World-Class Translation Quality**: Hy-MT1.5-1.8B-2bit is built upon the Hy-MT1.5-1.8B foundation model, a specialized translation model developed by the Tencent Hunyuan Team through a holistic multi-stage training pipeline that integrates MT-oriented pre-training, supervised fine-tuning, on-policy distillation, and reinforcement learning. The base model natively supports 33 languages, 5 dialects/minority languages, and 1,056 translation directions. With only 1.8B parameters, it comprehensively outperforms much larger open-source models (e.g., Tower-Plus-72B, Qwen3-32B) and mainstream commercial translation APIs (e.g., Microsoft Translator, Doubao Translator). For full details, please refer to the HY-MT1.5 Technical Report.
- **Ultra-Compact 2-bit Quantization**: Hy-MT1.5-1.8B-2bit employs industry-leading Stretched Elastic Quantization (SEQ) to quantize model weights to the four levels {-1.5, -0.5, 0.5, 1.5}, combined with quantization-aware distillation. This compresses the original 3.3GB FP16 model down to just 574MB while maintaining near-lossless translation quality that surpasses models hundreds of GBs in size. The quantization details are described in the AngelSlim Technical Report.
- **On-Device Deployment**: Optimized for Arm SME2-capable mobile devices (e.g., Apple M4, vivo X300), the 2-bit model enables fast, fully offline translation directly on your phone, with no internet connection required. Your data never leaves the device, ensuring complete privacy.
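To give a feel for the weight format, the sketch below rounds FP weights to the nearest of the four 2-bit levels {-1.5, -0.5, 0.5, 1.5} with a naive per-row scale. This is only an illustration of 2-bit level mapping, not the SEQ algorithm or its quantization-aware distillation (see the AngelSlim Technical Report for those).

```python
import numpy as np

# Illustrative sketch only: NOT the actual SEQ algorithm, just a minimal
# round-to-nearest mapping onto the same four 2-bit levels, using a naive
# per-row scale (an assumption for demonstration purposes).
LEVELS = np.array([-1.5, -0.5, 0.5, 1.5])

def quantize_2bit(w: np.ndarray):
    """Map each row of `w` to 2-bit codes; return (codes, dequantized weights)."""
    scale = np.abs(w).mean(axis=1, keepdims=True) + 1e-8  # per-row scale (assumption)
    # Distance of each scaled weight to each level -> index of nearest level.
    codes = np.abs(w[:, :, None] / scale[:, :, None] - LEVELS).argmin(axis=-1)
    dequant = LEVELS[codes] * scale  # what a kernel would reconstruct at runtime
    return codes.astype(np.uint8), dequant

# Back-of-envelope size check: 1.8e9 params * 2 bits = 450 MB of raw codes;
# the released 574 MB file presumably also stores scales, embeddings, and
# metadata at higher precision (assumption, not stated in the report).
w = np.random.randn(8, 64).astype(np.float32)
codes, w_hat = quantize_2bit(w)
print(codes.dtype, w_hat.shape)  # uint8 (8, 64)
```

Each code fits in 2 bits (values 0-3), which is where the roughly 5-6x compression over FP16 comes from before packing overhead.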
📈 Translation Benchmarks
Performance comparison of different model sizes on the Flores-200 Chinese-Foreign mutual translation benchmark:
⚡ Speed Demo
Speed comparison of the 2-bit model on SME2 and Neon kernels:
📱 Demo
We provide a ready-to-use Android demo APK for offline translation. The app features a background word extraction mode that works across any app on your phone — browse emails, webpages, or chat messages and get instant translations without switching apps. No network required, no data collection, one-time download for permanent use.
Download Demo:
https://huggingface.co/AngelSlim/Hy-MT1.5-1.8B-1.25bit-GGUF/resolve/main/Hy-MT-demo.apk
Translation Demo
Demo device: Snapdragon 865, 8GB RAM.
Background Word Extraction Mode
Demo device: Snapdragon 7+ Gen 2, 16GB RAM.
💻 Deployment
Our llama.cpp kernel is coming soon.
📥 Download Links
- 2-bit model weights: https://huggingface.co/AngelSlim/Hy-MT1.5-1.8B-2bit
- 2-bit model GGUF: https://huggingface.co/AngelSlim/Hy-MT1.5-1.8B-2bit-GGUF
- 1.25-bit model weights: https://huggingface.co/AngelSlim/Hy-MT1.5-1.8B-1.25bit
- 1.25-bit model GGUF: https://huggingface.co/AngelSlim/Hy-MT1.5-1.8B-1.25bit-GGUF
- Demo: https://huggingface.co/AngelSlim/Hy-MT1.5-1.8B-1.25bit-GGUF/resolve/main/Hy-MT-demo.apk
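All of the artifacts above follow Hugging Face's `resolve/main` direct-download URL pattern. As a small standard-library sketch (file names other than the demo APK are not listed here, so check each repo's file listing for exact GGUF names):

```python
from urllib.parse import quote

HF_BASE = "https://huggingface.co"

def resolve_url(repo_id: str, filename: str) -> str:
    """Build the resolve/main direct-download URL for a file in an HF repo."""
    return f"{HF_BASE}/{quote(repo_id)}/resolve/main/{quote(filename)}"

# Reproduces the demo link from the list above:
print(resolve_url("AngelSlim/Hy-MT1.5-1.8B-1.25bit-GGUF", "Hy-MT-demo.apk"))
# https://huggingface.co/AngelSlim/Hy-MT1.5-1.8B-1.25bit-GGUF/resolve/main/Hy-MT-demo.apk
```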
📄 Technical Reports
- HY-MT1.5 Technical Report: https://arxiv.org/abs/2512.24092
- AngelSlim Technical Report: https://arxiv.org/abs/2602.21233
- Sherry Paper: https://arxiv.org/abs/2601.07892
📝 License
The code for this project is open-sourced under the AngelSlim License.
🔗 Citation
@article{angelslim2026,
  title={AngelSlim: A more accessible, comprehensive, and efficient toolkit for large model compression},
  author={Hunyuan AI Infra Team},
  journal={arXiv preprint arXiv:2602.21233},
  year={2026}
}

@misc{zheng2025hymt,
  title={HY-MT1.5 Technical Report},
  author={Mao Zheng and Zheng Li and Tao Chen and Mingyang Song and Di Wang},
  year={2025},
  eprint={2512.24092},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2512.24092},
}
💬 Technical Discussion
- AngelSlim is under active development, and new features will be released regularly. If you have any questions or suggestions, please open an issue on GitHub Issues or join our WeChat discussion group.