AMAImedia/Qwen3-1.7B-TTS-Cross-Darwin-NOESIS-AWQ-INT4

AWQ INT4 quantization of FINAL-Bench/Darwin-TTS-1.7B-Cross, optimized for low-VRAM consumer hardware (RTX 3060 6 GB).

Released as part of the NOESIS Professional Multilingual Dubbing Automation Platform (framework: DHCF-FNO — Deterministic Hybrid Control Framework for Frozen Neural Operators).


⚠️ License notice

This model is derived from FINAL-Bench/Darwin-TTS-1.7B-Cross, which is licensed under Apache 2.0. This AWQ quantization retains the same Apache 2.0 license; see the LICENSE file in this repository for the full text.


Model summary

| Property | Value |
| --- | --- |
| Base model | FINAL-Bench/Darwin-TTS-1.7B-Cross |
| Architecture | Qwen3TTSForConditionalGeneration |
| Model type | qwen3_tts |
| TTS model size | ~2.1B total (talker 28L + code_predictor + codec + encoder/decoder) |
| TTS model type | base with 3% FFN blend from Qwen3-1.7B LLM |
| Tokenizer | 12 Hz RVQ codec tokenization |
| Speaker encoder | x-vector based (x_vector_only_mode=True), sample_rate=24 000 Hz |
| Original precision | BF16 safetensors (~3.4 GB) |
| Quantized precision | AWQ INT4 (group_size=128, GEMM, zero_point=True), talker weights only |
| Languages | 10 (ko, en, ja, zh + 6 more) |
| Disk footprint | ~0.9 GB |
| Inference VRAM | ~1.2 GB (fits alongside other specialists) |
| Quantization library | AutoAWQ 0.2.9 |
| Calibration set | 128 TTS prompts (short natural speech samples), max_seq_len=512 |
| RNG seed | 1729 (NOESIS reproducibility lock) |

Architecture note: Darwin-TTS-1.7B-Cross is a merge model — not a fine-tuned model. It blends 3% of FFN weights from Qwen3-1.7B (LLM) into the talker backbone of Qwen3-TTS-12Hz-1.7B-Base (84 / 976 tensors: gate/up/down_proj × 28 talker layers). This weight-space arithmetic (pure lerp, no training) adds emotional expressiveness while preserving TTS stop-signal stability.
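The weight-space arithmetic above can be sketched in a few lines. This is an illustrative stand-in, not the actual merge script: the tensor names and shapes are assumptions, and random tensors replace the real checkpoint weights.

```python
import torch

# Hypothetical sketch of the 3% FFN lerp described above: blend an LLM
# projection weight into the matching TTS talker projection weight.
# Shapes below are stand-ins for a Qwen3-1.7B-style FFN, not checkpoint keys.
alpha = 0.03  # blend ratio; at alpha >= 0.10 the LLM reportedly overrides the TTS stop signal

torch.manual_seed(1729)
tts_gate_proj = torch.randn(6144, 2048)  # talker layer gate_proj (stand-in)
llm_gate_proj = torch.randn(6144, 2048)  # Qwen3-1.7B LLM gate_proj (stand-in)

# Pure linear interpolation in weight space -- no training involved.
merged = (1.0 - alpha) * tts_gate_proj + alpha * llm_gate_proj
```

Repeating this for gate/up/down_proj across all 28 talker layers yields the 84 blended tensors mentioned above; every other tensor is copied from the TTS base unchanged.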

4-module structure:

  • talker — 28-layer Qwen3 LLM backbone (AWQ-quantized)
  • code_predictor — 5-layer token head, hidden_size=1024 (BF16, untouched)
  • speech_tokenizer — 12 Hz RVQ codec (BF16, untouched)
  • encoder/decoder — audio waveform pipeline (BF16, untouched)

The "Cross" refers to cross-lingual voice cloning using x-vector speaker embedding (x_vector_only_mode=True). At α≥10% the LLM's generation pattern overrides the TTS stop signal — hence the conservative 3% blend ratio.

⚠️ AWQ compatibility: AutoAWQ quantizes the talker (language model head) component. Non-standard modules (code_predictor, speaker_encoder) are kept in BF16 within the checkpoint. Verify output with validate_awq_quality.py before using as a KD teacher.
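validate_awq_quality.py is not included in this card, but a typical check of this kind compares BF16 and AWQ logits on the same inputs. The sketch below is an assumption about what such a script might measure (KL divergence), with random tensors standing in for real model outputs:

```python
import torch
import torch.nn.functional as F

# Hedged sketch: compare reference (BF16) talker logits against the AWQ
# model's logits on identical inputs via KL divergence. Random tensors
# stand in for actual forward-pass outputs; the threshold is illustrative.
torch.manual_seed(1729)
bf16_logits = torch.randn(4, 151936)                       # reference logits
awq_logits = bf16_logits + 0.05 * torch.randn(4, 151936)   # quantized output

kl = F.kl_div(
    F.log_softmax(awq_logits, dim=-1),
    F.log_softmax(bf16_logits, dim=-1),
    log_target=True,
    reduction="batchmean",
)
print(f"mean KL(bf16 || awq) = {kl.item():.6f}")
```

A small mean KL on a held-out prompt set suggests the quantized logits are safe to use as KD soft targets; a large value means the INT4 talker has drifted too far from the BF16 teacher.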


Why this quantization

The original Darwin-TTS-1.7B-Cross weights are in BF16 (~3.4 GB). While this fits in 6 GB of VRAM in principle, AWQ INT4 cuts the inference footprint to ~1.2 GB, enabling parallel loading alongside other NOESIS specialists in the sequential swapping pipeline.

This AWQ build:

  1. Reduces VRAM from ~3.4 GB to ~1.2 GB — freeing headroom for KV cache
  2. Uses GEMM kernel — compatible with device_map={"":0} (no CPU offload)
  3. Provenance-tracked (noesis_provenance.json ships alongside the model)
  4. Calibrated on natural TTS prompts matching cross-lingual synthesis distribution
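A back-of-envelope check of the figures above, assuming ~1.7B talker parameters (the real footprints also include AWQ scales/zero-points, the untouched BF16 modules, and runtime buffers):

```python
# Rough size estimate from parameter count and bits per weight.
talker_params = 1.7e9

bf16_gb = talker_params * 2 / 1e9    # 2 bytes/param  -> ~3.4 GB
int4_gb = talker_params * 0.5 / 1e9  # 4 bits/param   -> ~0.85 GB

print(f"BF16 ~{bf16_gb:.1f} GB, AWQ INT4 ~{int4_gb:.2f} GB")
```

The gap between ~0.85 GB of INT4 weights and the ~1.2 GB inference figure is accounted for by group-wise scales/zero-points plus the BF16 code_predictor, codec, and encoder/decoder modules.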

Quantization methodology

This checkpoint was produced using a proprietary quantization pipeline developed by AMAImedia as part of the NOESIS DHCF-FNO framework (v14.7). The Qwen3TTSForConditionalGeneration architecture is not present in the transformers main branch and has no upstream AutoAWQ support; quantization required original engineering work developed internally at AMAImedia.


How to use

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer
import torch

model_id = "amaimedia/Qwen3-1.7B-TTS-Cross-Darwin-NOESIS-AWQ-INT4"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoAWQForCausalLM.from_quantized(
    model_id,
    device_map={"": 0},
    torch_dtype=torch.float16,
    fuse_layers=False,
    trust_remote_code=True,
)

# TTS inference requires additional processing — see base model card for full pipeline
prompt = "Hello, how are you today?"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
out = model.generate(**inputs, max_new_tokens=256, do_sample=False)
```

Note: Full TTS synthesis (audio waveform output) requires the complete NOESIS TTS pipeline with NanoCodec vocoder. The AWQ checkpoint above is suitable for logit extraction (KD) and text-side token generation.
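For the KD use case, extracted teacher logits are typically converted into temperature-softened targets. The snippet below is illustrative only; the temperature value and shapes are assumptions, not NOESIS settings, and a random tensor stands in for real extracted logits:

```python
import torch
import torch.nn.functional as F

# Turn teacher logits into soft KD targets via temperature-scaled softmax.
T = 2.0  # assumed distillation temperature, not a NOESIS-published value
torch.manual_seed(1729)
teacher_logits = torch.randn(1, 8, 151936)  # (batch, seq, Qwen3 vocab)
soft_targets = F.softmax(teacher_logits / T, dim=-1)
```

A higher T flattens the distribution, exposing more of the teacher's ranking over near-miss tokens to the student.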


NOESIS context

In NOESIS this model serves as the TTS stream-A teacher for Specialist M3-TTS during knowledge distillation. It provides cross-lingual text-side logits (vocab space) used in build_ensemble_labels.py with proposed weight w=0.20 (secondary, combined with codec-side teachers via NanoCodec).

NOESIS TTS knowledge distillation uses two parallel streams:

| Stream | Vocab | Teachers |
| --- | --- | --- |
| Stream A (text-side) | 151 936 (Qwen3) | Darwin-TTS-1.7B-Cross + others |
| Stream B (codec-side) | 65 536 (NanoCodec) | nanocodec_distill.py |
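build_ensemble_labels.py is not public, so the exact blending rule is unknown; one plausible reading of the proposed w=0.20 is a probability-space weighted average across teachers, sketched here with random tensors standing in for real teacher outputs:

```python
import torch
import torch.nn.functional as F

# Hypothetical ensemble-label blend at w=0.20 for this (secondary) teacher.
w = 0.20
torch.manual_seed(1729)
primary = F.softmax(torch.randn(1, 8, 151936), dim=-1)  # primary teacher(s)
darwin = F.softmax(torch.randn(1, 8, 151936), dim=-1)   # this model's stream-A output
ensemble = (1.0 - w) * primary + w * darwin
```

Blending in probability space (rather than logit space) keeps the result a valid distribution without renormalization; whether NOESIS does this is an assumption.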

NOESIS specialists overview:

| ID | Role | Size |
| --- | --- | --- |
| M1 | ASR (150+ langs) | 10B/3B |
| M2 | Dubbing LM (30 langs full) | 10B/3B |
| M3 | TTS + voice cloning | 10B/3B |
| M4 | Chat + creative writing | 10B/3B |
| M5 | Code + math | 10B/3B |
| M6 | Deep research (1M ctx) | 10B/3B |
| M7 | Prompt engineering | 4B/0.8B |
| M8 | Quality control (PRM) | 4B/0.8B |
| M9 | Orchestrator + routing | 4B/0.8B |

Acknowledgements & citation

Base model: Darwin-TTS-1.7B-Cross by FINAL-Bench (Qwen3 TTS architecture, cross-lingual synthesis).

```bibtex
@misc{darwin_tts_cross,
  title     = {Darwin-TTS-1.7B-Cross},
  author    = {FINAL-Bench},
  year      = {2025},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/FINAL-Bench/Darwin-TTS-1.7B-Cross}
}
```

Quantization & NOESIS integration:

```bibtex
@misc{noesis_v14,
  title     = {NOESIS v14.7: DHCF-FNO Multilingual Dubbing Platform},
  author    = {Bolotnikov, Ilia},
  year      = {2026},
  publisher = {AMAImedia},
  url       = {https://amaimedia.com}
}
```