cHunter789/Qwen3.6-27B-i1-IQ4_XS-GGUF

Qwen3.6-27B-i1-IQ4_XS (Fully Optimized)

Motivation

Recent updates in the llama.cpp repository (specifically commit 1dab5f5a44) introduced a hardcoded minimum quantization of q5_K for attn_qkv layers. This was likely intended to preserve model quality, but it noticeably increases the final file size.

For comparison, mradermacher's Qwen3.5-27B iq4_xs weighed in at 14.7 GB, whereas the equivalent Qwen3.6 i1-GGUF produced under the new rules grew to over 15.1 GB.

Methodology

To restore the balance of size and performance, I modified the llama.cpp source code so that attn_qkv layers are quantized as pure IQ4_XS again. This mirrors, layer for layer, the quantization strategy originally used in mradermacher's Qwen3.5-27B release.

This model was quantized using the imatrix from mradermacher's Qwen3.6-27B-i1-GGUF.
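One way to check which quantization type the attn_qkv tensors actually ended up with is to read the tensor table of the resulting file. The following is a minimal sketch, assuming the `gguf` Python package from llama.cpp's gguf-py is installed and that the relevant tensor names contain the substring `attn_qkv` (an assumption based on how the layers are named in this card). The helper names `read_tensor_types` and `count_types` are hypothetical and not part of any library.

```python
import sys
from collections import Counter


def read_tensor_types(gguf_path):
    """Return (tensor_name, quant_type_name) pairs from a GGUF file."""
    # Imported lazily so the counting helper below works without the package.
    from gguf import GGUFReader  # assumption: installed via `pip install gguf`

    reader = GGUFReader(gguf_path)
    return [(t.name, t.tensor_type.name) for t in reader.tensors]


def count_types(pairs, name_filter="attn_qkv"):
    """Count quantization types among tensors whose name contains name_filter."""
    return Counter(qtype for name, qtype in pairs if name_filter in name)


if __name__ == "__main__" and len(sys.argv) > 1:
    pairs = read_tensor_types(sys.argv[1])
    # Types used by the attn_qkv tensors only.
    print("attn_qkv tensors:", dict(count_types(pairs)))
    # Types used across the whole file (an empty filter matches every name).
    print("all tensors:", dict(count_types(pairs, name_filter="")))
```

Run against Qwen3.6-27B.i1-IQ4_XS-attn_qkv-IQ4_XS.gguf, the first line should list only IQ4_XS if the override took effect. Run against the standard build, it would be expected to show Q5_K instead.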

Performance vs. Size Trade-off

Perplexity testing (llama-perplexity with pg19.txt, 65k context, Q8_0 KV cache) shows that forcing pure IQ4_XS across all layers raises perplexity by only 0.0039, well within the reported ±0.028 uncertainty of either measurement, while noticeably reducing the memory footprint.

./llama-perplexity -m Qwen3.6-27B.i1-IQ4_XS.gguf -f pg19.txt -c 65536 --chunks 32 -ngl -1 -ctk q8_0 -ctv q8_0 -fa 1 -b 512 -ub 128

./llama-perplexity -m Qwen3.6-27B.i1-IQ4_XS-attn_qkv-IQ4_XS.gguf -f pg19.txt -c 65536 --chunks 32 -ngl -1 -ctk q8_0 -ctv q8_0 -fa 1 -b 512 -ub 128

🧠 Intelligence (Perplexity) Comparison

| Model Version | Perplexity (PPL) | Difference / Quality Drop |
|---|---|---|
| Standard IQ4_XS (with q5_K attn_qkv) | 7.3765 ± 0.02760 | Baseline |
| Custom IQ4_XS (pure / fully iq4) | 7.3804 ± 0.02762 | +0.0039 (Negligible) |

Conclusion: This custom build saves 375 MiB of memory and brings the file size back toward the 14.7 GB mark, at the cost of a roughly 0.05% increase in perplexity, which is practically negligible for output quality.
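The figures above can be re-derived with a few lines of arithmetic. This is a rough sketch rather than a rigorous significance test: it assumes the two error bars are independent and combines them in quadrature, even though both runs use the same text and their errors are probably correlated. The function names are illustrative only.

```python
import math

# Values copied from the perplexity comparison in this card.
BASELINE_PPL, BASELINE_ERR = 7.3765, 0.02760  # standard IQ4_XS (q5_K attn_qkv)
CUSTOM_PPL, CUSTOM_ERR = 7.3804, 0.02762      # pure IQ4_XS


def compare_ppl(base, base_err, other, other_err):
    """Return absolute delta, relative delta in percent, and delta / combined error."""
    delta = other - base
    rel_pct = 100.0 * delta / base
    # Assumption: independent errors, combined in quadrature.
    combined_err = math.hypot(base_err, other_err)
    return delta, rel_pct, delta / combined_err


def mib_to_gb(mib):
    """Convert MiB (2**20 bytes) to decimal GB (10**9 bytes)."""
    return mib * 1024 * 1024 / 1e9


delta, rel_pct, ratio = compare_ppl(BASELINE_PPL, BASELINE_ERR, CUSTOM_PPL, CUSTOM_ERR)
print(f"PPL delta:              {delta:+.4f}")
print(f"Relative PPL increase:  {rel_pct:.3f} %")
print(f"Delta / combined error: {ratio:.2f}")
print(f"375 MiB in GB:          {mib_to_gb(375):.3f}")
```

A ratio far below 1 means the measured difference is much smaller than the measurement noise under the stated assumption. The MiB-to-GB conversion shows that the saving accounts for most of the gap between the 15.1 GB and 14.7 GB file sizes quoted earlier.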
