sphaela/Qwen3.6-27B-AutoRound-GGUF
Qwen3.6-27B GGUF (AutoRound Quantized)
This repository contains GGUF quantized versions of Qwen/Qwen3.6-27B created using Intel's AutoRound quantization method.
Quantization Details
The models were quantized using the various schemes provided by the auto-round tool. For better compatibility and smaller per-file size, the unified multimodal projector (mmproj) is provided separately in F16, BF16, and F32 formats.
Files and Sizes
| File Name | Quant Type | Size | Description |
|---|---|---|---|
| Qwen3.6-27B-Q2_K_S.gguf | Q2_K_S | 8.9 GB | Extremely high compression, significant quality loss. |
| Qwen3.6-27B-Q2_K_MIXED.gguf | Q2_K_MIXED | 16 GB | Recommended high-compression option. Uses Q4 for the KV cache with good quality. Fast inference. |
| Qwen3.6-27B-Q3_K_S.gguf | Q3_K_S | 12 GB | Very high compression, notable quality loss. |
| Qwen3.6-27B-Q3_K_M.gguf | Q3_K_M | 12 GB | Balanced 3-bit quantization. |
| Qwen3.6-27B-Q3_K_L.gguf | Q3_K_L | 12 GB | High-quality 3-bit quantization. |
| Qwen3.6-27B-Q4_0.gguf | Q4_0 | 15 GB | Standard 4-bit quantization, good balance. |
| Qwen3.6-27B-Q4_1.gguf | Q4_1 | 16 GB | Higher-quality 4-bit quantization than Q4_0. |
| Qwen3.6-27B-Q4_K_S.gguf | Q4_K_S | 15 GB | Small 4-bit K-quant, good efficiency. |
| Qwen3.6-27B-Q4_K_M.gguf | Q4_K_M | 15 GB | Recommended 4-bit K-quant, excellent balance. |
| Qwen3.6-27B-Q5_0.gguf | Q5_0 | 18 GB | Standard 5-bit quantization, very high quality. |
| Qwen3.6-27B-Q5_1.gguf | Q5_1 | 19 GB | Higher-quality 5-bit quantization than Q5_0. |
| Qwen3.6-27B-Q5_K_S.gguf | Q5_K_S | 18 GB | Small 5-bit K-quant, very high quality. |
| Qwen3.6-27B-Q5_K_M.gguf | Q5_K_M | 18 GB | Recommended 5-bit K-quant, near-lossless. |
| Qwen3.6-27B-Q6_K.gguf | Q6_K | 21 GB | 6-bit K-quant, virtually indistinguishable from F16. |
| Qwen3.6-27B-Q8_0.gguf | Q8_0 | 29 GB | 8-bit quantization, near-lossless. |
| mmproj-model-f16.gguf | F16 | 885 MB | Unified projector in Float16 format. |
| mmproj-model-bf16.gguf | BF16 | 889 MB | Unified projector in BFloat16 format. |
| mmproj-model-f32.gguf | F32 | 1.8 GB | Unified projector in Float32 format. |
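As a rough sizing aid when choosing a quant: total memory is approximately the model file plus the mmproj file plus the KV cache. The Python sketch below uses the table's sizes for Q4_K_M and the F16 projector; the architecture numbers (layer count, KV heads, head dimension) are hypothetical placeholders, since this card does not state them, and the KV-cache formula is a generic approximation.

```python
GIB = 1024 ** 3

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, ctx_len, bytes_per_elem=2):
    # K and V tensors, per layer, per position; fp16 cache by default.
    return 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_elem

# Hypothetical architecture numbers, for illustration only.
kv = kv_cache_bytes(n_layers=48, n_kv_heads=8, head_dim=128, ctx_len=8192)

# Q4_K_M file (15 GB) + F16 mmproj (885 MB) + KV cache.
total = 15 * GIB + 0.885 * GIB + kv
print(f"~{total / GIB:.1f} GiB needed")
```

Swap in any row from the table and your real context length to estimate other configurations.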
Generating the Models
The models were generated using Intel's AutoRound with the following command:
```shell
auto-round --model Qwen/Qwen3.6-27B --output_dir ./quantized/ --scheme <SCHEME> --iters 0
```
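To reproduce several schemes in one pass, the invocations can be generated programmatically. This Python sketch only builds the command strings from the flags shown above; actually running them requires the auto-round package and access to the base model, and it assumes the scheme names match the quant types in the table.

```python
import shlex

# Any subset of the quant types from the table above.
schemes = ["Q4_K_M", "Q5_K_M", "Q8_0"]

cmds = [
    shlex.join([
        "auto-round",
        "--model", "Qwen/Qwen3.6-27B",
        "--output_dir", "./quantized/",
        "--scheme", scheme,
        "--iters", "0",  # same flag as the command above
    ])
    for scheme in schemes
]

for cmd in cmds:
    print(cmd)  # e.g. pipe into a shell, or pass to subprocess.run(shlex.split(cmd))
```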
Usage with llama.cpp
These models can be used with llama.cpp. For multimodal usage, you must pass the projector file via the --mmproj flag:
```shell
./llama-cli -m Qwen3.6-27B-Q4_K_M.gguf --mmproj mmproj-model-f16.gguf --image your_image.jpg -p "Describe this image."
```
About AutoRound
AutoRound is a weight-quantization method from Intel that minimizes accuracy loss by treating each weight's up-or-down rounding as a learnable decision and optimizing it (via signed gradient descent on a small calibration set) against the layer's output error, rather than rounding every weight to its nearest grid point in isolation.
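The core idea can be illustrated with a toy NumPy sketch. This is not Intel's implementation: real AutoRound learns the rounding (and clipping ranges) with signed gradient descent over transformer blocks, whereas this demo uses a greedy per-weight search on a single random linear layer. It only shows why optimizing rounding against the layer's *output* can beat plain round-to-nearest (RTN).

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 16)).astype(np.float32)   # toy calibration activations
W = rng.normal(size=(16, 4)).astype(np.float32)    # toy layer weights

scale = np.abs(W).max() / 7.0  # one shared 4-bit scale (illustrative)

def out_err(q):
    # Mean squared error of the layer OUTPUT, not of the weights themselves.
    return float(np.mean((X @ W - X @ (q * scale)) ** 2))

# RTN baseline: round each weight to its nearest grid point.
q = np.clip(np.round(W / scale), -8, 7)
err_rtn = out_err(q)

# Emulate AutoRound's idea with greedy coordinate search: nudge each
# weight's rounding up or down, keeping any change that lowers output error.
improved = True
while improved:
    improved = False
    for idx in np.ndindex(W.shape):
        for delta in (-1, 1):
            cand = q.copy()
            cand[idx] = np.clip(cand[idx] + delta, -8, 7)
            if out_err(cand) < out_err(q):
                q, improved = cand, True
err_opt = out_err(q)

print(err_rtn, "->", err_opt)  # optimized rounding never does worse than RTN
```

The search starts from RTN and only accepts strict improvements, so the optimized output error is guaranteed to be at most the RTN error.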