# Abiray/Huihui-Qwen3.6-35B-A3B-abliterated-GGUF
This repository provides GGUF format quantizations for the huihui-ai/Huihui-Qwen3.6-35B-A3B-abliterated model.
Because this model has been fully "abliterated" to bypass alignment and safety refusals, it acts as a highly capable engine for unrestricted creative writing, dynamic storytelling, and immersive roleplay scenarios.
## Available Quantizations

| File | Bit Size | Description |
|---|---|---|
| huihui-35B-Q8_0.gguf | 8-bit | Highest quality quant, virtually indistinguishable from F16. |
| huihui-35B-Q6_K.gguf | 6-bit | Excellent quality with a noticeably reduced memory footprint. |
| huihui-35B-Q5_K_M.gguf | 5-bit | Good balance between reasoning performance and RAM usage. |
| huihui-35B-Q4_K_M.gguf | 4-bit | **Recommended.** The sweet spot between speed and quality. |
| huihui-35B-Q4_K_S.gguf | 4-bit | Slightly smaller than K_M, allowing faster inference on constrained setups. |
| huihui-35B-Q3_K_M.gguf | 3-bit | Lowest resource requirement, though perplexity loss becomes more noticeable. |
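As a rough rule of thumb, a quant's file size scales with the bit width in the table above. The sketch below estimates this from the nominal bits per weight; the 35B parameter count is taken from the model name, and real GGUF files run somewhat larger because K-quants keep some tensors at higher precision:

```python
# Rough size estimate: parameters * bits-per-weight / 8 bytes.
# Actual GGUF files are larger (mixed-precision tensors, metadata).
PARAMS = 35e9  # assumed from the "35B" in the model name

def approx_gib(bits_per_weight: float) -> float:
    """Approximate file size in GiB for a given nominal bit width."""
    return PARAMS * bits_per_weight / 8 / 2**30

for name, bits in [("Q8_0", 8), ("Q6_K", 6), ("Q5_K_M", 5),
                   ("Q4_K_M", 4), ("Q3_K_M", 3)]:
    print(f"{name}: ~{approx_gib(bits):.1f} GiB minimum")
```

Comparing these lower bounds against your free RAM (or VRAM, if offloading) is a quick way to pick a quant before downloading.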
## Quick Start (llama.cpp)
These models are designed to be run directly via llama.cpp. The following commands are standard for local Linux environments (such as Linux Mint or Ubuntu).
1. Clone and compile via CMake:

```bash
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release
```
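Once built, a quant can be loaded with the `llama-cli` binary. The command below is a sketch: the model filename matches the Q4_K_M entry in the table above, and the context size and GPU-offload value are assumptions to adjust for your hardware:

```bash
# Sketch: run the recommended Q4_K_M quant interactively.
# -ngl 99 offloads all layers to the GPU; omit it for CPU-only inference.
./build/bin/llama-cli \
  -m huihui-35B-Q4_K_M.gguf \
  -p "Write the opening scene of a heist story." \
  -n 512 -c 8192 -ngl 99
```

If generation stalls or the process is killed, drop to a smaller quant from the table or reduce `-c` to lower memory pressure.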