
tencent/HY-MT1.5-1.8B



🤗 Hugging Face  |  🕹️ Demo  |  🤖 ModelScope  |  🖥️ Official Website  |  GitHub

Model Introduction

Hunyuan Translation Model Version 1.5 includes a 1.8B translation model, HY-MT1.5-1.8B, and a 7B translation model, HY-MT1.5-7B. Both models focus on mutual translation across 33 languages, including 5 ethnic-minority languages and dialects. HY-MT1.5-7B is an upgraded version of our WMT25 championship model, optimized for explanatory translation and mixed-language scenarios, with newly added support for terminology intervention, contextual translation, and formatted translation. Despite having less than one-third the parameters of HY-MT1.5-7B, HY-MT1.5-1.8B delivers translation quality comparable to its larger counterpart, combining high speed with high quality. After quantization, the 1.8B model can be deployed on edge devices and supports real-time translation scenarios, making it widely applicable.

Key Features and Advantages

  • HY-MT1.5-1.8B achieves industry-leading performance among models of its size, surpassing most commercial translation APIs.
  • HY-MT1.5-1.8B can be deployed on edge devices and used in real-time translation scenarios, offering broad applicability.
  • HY-MT1.5-7B has been further optimized for annotated and mixed-language scenarios compared with its September open-source version.
  • Both models support terminology intervention, contextual translation, and formatted translation.

Related News

  • 2025.12.30: We open-sourced HY-MT1.5-1.8B and HY-MT1.5-7B on Hugging Face.
  • 2025.9.1: We open-sourced Hunyuan-MT-7B and Hunyuan-MT-Chimera-7B on Hugging Face.

Performance

You can refer to our technical report for more experimental results and analysis.

Technical Report

 

Model Links

Model Name              | Description                                   | Download
HY-MT1.5-1.8B           | Hunyuan 1.8B translation model                | 🤗 Model
HY-MT1.5-1.8B-FP8       | Hunyuan 1.8B translation model, FP8 quantized | 🤗 Model
HY-MT1.5-1.8B-GPTQ-Int4 | Hunyuan 1.8B translation model, Int4 (GPTQ) quantized | 🤗 Model
HY-MT1.5-7B             | Hunyuan 7B translation model                  | 🤗 Model
HY-MT1.5-7B-FP8         | Hunyuan 7B translation model, FP8 quantized   | 🤗 Model
HY-MT1.5-7B-GPTQ-Int4   | Hunyuan 7B translation model, Int4 (GPTQ) quantized | 🤗 Model
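
If you prefer to fetch the weights ahead of time, a minimal sketch using the huggingface_hub library is shown below; the repository ID is the base 1.8B model from the table above, and the quantized variants can be downloaded the same way (this download step is an optional convenience, not part of the official instructions).

from huggingface_hub import snapshot_download

# Download the base 1.8B checkpoint into the local Hugging Face cache.
# Substitute the corresponding repository ID from the table above to fetch
# the FP8 or GPTQ-Int4 variants instead.
local_dir = snapshot_download("tencent/HY-MT1.5-1.8B")
print(f"Model files are available at: {local_dir}")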

Prompts

Prompt Template for ZH<=>XX Translation. (In English, the Chinese template below reads: "Translate the following text into {target_language}; output only the translated result, with no additional explanation:")


将以下文本翻译为{target_language},注意只需要输出翻译后的结果,不要额外解释:

{source_text}

Prompt Template for XX<=>XX Translation, excluding ZH<=>XX.


Translate the following segment into {target_language}, without additional explanation.

{source_text}
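
As a convenience, the two templates above can be filled in with ordinary string formatting. The helper below is only an illustrative sketch: the function name and arguments are hypothetical, and it simply chooses the Chinese template when Chinese is involved and the English template otherwise.

# Illustrative sketch: fill the two basic prompt templates above with string
# formatting. `build_basic_prompt` is a hypothetical helper, not an official API.
def build_basic_prompt(source_text: str, target_language: str, zh_involved: bool) -> str:
    if zh_involved:
        # ZH<=>XX: the Chinese template
        return (
            f"将以下文本翻译为{target_language},注意只需要输出翻译后的结果,不要额外解释:\n\n"
            f"{source_text}"
        )
    # XX<=>XX (excluding ZH): the English template
    return (
        f"Translate the following segment into {target_language}, "
        f"without additional explanation.\n\n{source_text}"
    )

# Example: English -> French (no Chinese involved)
print(build_basic_prompt("It’s on the house.", "French", zh_involved=False))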

Prompt Template for Terminology Intervention. (In English, the Chinese template below reads: "Refer to the translation below: {source_term} is translated as {target_term}. Translate the following text into {target_language}; output only the translated result, with no additional explanation:")


参考下面的翻译:
{source_term} 翻译成 {target_term}

将以下文本翻译为{target_language},注意只需要输出翻译后的结果,不要额外解释:
{source_text}

Prompt Template for Contextual Translation. (In English, the Chinese template below places {context} first and then reads: "Referring to the information above, translate the following text into {target_language}; do not translate the context above, and do not add extra explanation:")


{context}
参考上面的信息,把下面的文本翻译成{target_language},注意不需要翻译上文,也不要额外解释:
{source_text}


Prompt Template for Formatted Translation. (In English, the Chinese template below reads: "Translate the text between <source></source> into Chinese; output only the translated result, with no additional explanation. The <sn></sn> tags in the source mark text carrying formatting information; keep these tags at the corresponding positions in the translation where possible. Output format: <target>str</target>")


将以下<source></source>之间的文本翻译为中文,注意只需要输出翻译后的结果,不要额外解释,原文中的<sn></sn>标签表示标签内文本包含格式信息,需要在译文中相应的位置尽量保留该标签。输出格式为:<target>str</target>

<source>{src_text_with_format}</source>
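
For completeness, the terminology, contextual, and formatted templates above can be assembled the same way. The helpers below are hypothetical and only mirror the placeholder fields from the templates ({source_term}, {target_term}, {context}, {src_text_with_format}); they are plain string formatting, not an official API.

# Hypothetical helpers that fill the terminology-intervention, contextual, and
# formatted templates shown above.
def terminology_prompt(source_text, target_language, source_term, target_term):
    return (
        f"参考下面的翻译:\n{source_term} 翻译成 {target_term}\n\n"
        f"将以下文本翻译为{target_language},注意只需要输出翻译后的结果,不要额外解释:\n"
        f"{source_text}"
    )

def contextual_prompt(source_text, target_language, context):
    return (
        f"{context}\n"
        f"参考上面的信息,把下面的文本翻译成{target_language},注意不需要翻译上文,也不要额外解释:\n"
        f"{source_text}"
    )

def formatted_prompt(src_text_with_format):
    return (
        "将以下<source></source>之间的文本翻译为中文,注意只需要输出翻译后的结果,不要额外解释,"
        "原文中的<sn></sn>标签表示标签内文本包含格式信息,需要在译文中相应的位置尽量保留该标签。"
        "输出格式为:<target>str</target>\n\n"
        f"<source>{src_text_with_format}</source>"
    )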

 

Use with transformers

First, install transformers; version 4.56.0 is recommended:

pip install transformers==4.56.0

Note: to load the FP8 model with transformers, you need to rename the "ignored_layers" key in config.json to "ignore" and upgrade compressed-tensors to version 0.11.0 (pip install compressed-tensors==0.11.0).
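
A minimal sketch of that config.json change is shown below, assuming the FP8 checkpoint has already been downloaded to a local directory (the path is a placeholder):

import json

# Placeholder path to the locally downloaded FP8 checkpoint.
config_path = "path/to/HY-MT1.5-1.8B-FP8/config.json"

with open(config_path, "r", encoding="utf-8") as f:
    config = json.load(f)

# Rename "ignored_layers" to "ignore". Depending on the checkpoint, the key may
# sit at the top level or inside "quantization_config"; rename it wherever it appears.
for section in (config, config.get("quantization_config", {})):
    if isinstance(section, dict) and "ignored_layers" in section:
        section["ignore"] = section.pop("ignored_layers")

with open(config_path, "w", encoding="utf-8") as f:
    json.dump(config, f, ensure_ascii=False, indent=2)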

The following code snippet shows how to use the transformers library to load and apply the model.

We use tencent/HY-MT1.5-1.8B as an example.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name_or_path = "tencent/HY-MT1.5-1.8B"

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
model = AutoModelForCausalLM.from_pretrained(model_name_or_path, device_map="auto")  # You may want to use bfloat16 and/or move to GPU here
messages = [
    {"role": "user", "content": "Translate the following segment into Chinese, without additional explanation.\n\nIt’s on the house."},
]
tokenized_chat = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=False,
    return_tensors="pt"
)

outputs = model.generate(tokenized_chat.to(model.device), max_new_tokens=2048)
output_text = tokenizer.decode(outputs[0])

We recommend the following set of parameters for inference. Note that our model does not have a default system prompt.

{
  "top_k": 20,
  "top_p": 0.6,
  "repetition_penalty": 1.05,
  "temperature": 0.7
}
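
For example, these values can be passed directly to generate() when continuing the snippet above; do_sample=True is an assumption here, since top_k, top_p, and temperature only take effect when sampling is enabled.

# Continue the snippet above with the recommended decoding parameters.
# do_sample=True is an assumption: sampling must be enabled for top_k / top_p /
# temperature to have any effect.
outputs = model.generate(
    tokenized_chat.to(model.device),
    max_new_tokens=2048,
    do_sample=True,
    top_k=20,
    top_p=0.6,
    temperature=0.7,
    repetition_penalty=1.05,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))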

 

Supported languages:

Language            | Abbr.   | Chinese Name
Chinese             | zh      | 中文
English             | en      | 英语
French              | fr      | 法语
Portuguese          | pt      | 葡萄牙语
Spanish             | es      | 西班牙语
Japanese            | ja      | 日语
Turkish             | tr      | 土耳其语
Russian             | ru      | 俄语
Arabic              | ar      | 阿拉伯语
Korean              | ko      | 韩语
Thai                | th      | 泰语
Italian             | it      | 意大利语
German              | de      | 德语
Vietnamese          | vi      | 越南语
Malay               | ms      | 马来语
Indonesian          | id      | 印尼语
Filipino            | tl      | 菲律宾语
Hindi               | hi      | 印地语
Traditional Chinese | zh-Hant | 繁体中文
Polish              | pl      | 波兰语
Czech               | cs      | 捷克语
Dutch               | nl      | 荷兰语
Khmer               | km      | 高棉语
Burmese             | my      | 缅甸语
Persian             | fa      | 波斯语
Gujarati            | gu      | 古吉拉特语
Urdu                | ur      | 乌尔都语
Telugu              | te      | 泰卢固语
Marathi             | mr      | 马拉地语
Hebrew              | he      | 希伯来语
Bengali             | bn      | 孟加拉语
Tamil               | ta      | 泰米尔语
Ukrainian           | uk      | 乌克兰语
Tibetan             | bo      | 藏语
Kazakh              | kk      | 哈萨克语
Mongolian           | mn      | 蒙古语
Uyghur              | ug      | 维吾尔语
Cantonese           | yue     | 粤语

Citing HY-MT1.5:

@misc{hy-mt1.5,
      title={HY-MT1.5 Technical Report}, 
      author={Mao Zheng and Zheng Li and Tao Chen and Mingyang Song and Di Wang},
      year={2025},
      eprint={2512.24092},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2512.24092}, 
}