dealignai/Qwen3.6-27B-JANG_4M-CRACK
**Important:** This model uses the JANG v2 mixed-precision quantization format for MLX on Apple Silicon. Attention layers stay at 8-bit affine precision while MLP and SSM layers compress to 4-bit. Loads via vMLX or the `jang-tools` Python package. Follow @dealignai for new releases.
# Qwen 3.6 27B — JANG_4M + CRACK
JANG v2 mixed-precision | CRACK abliterated | Vision + Video | Dense Hybrid SSM/Attention | 16 GB
## What Is This?
This is Qwen 3.6 27B — a 27B-parameter dense vision-language model with hybrid linear + full-attention architecture, native image + video understanding, and bilingual EN/ZH capability.
It has been:
- JANG v2 quantized — JANG_4M profile: 8-bit affine attention + 4-bit base (avg 4.45 bits), vision tower preserved — 16 GB
- CRACK abliterated — permanent weight-level removal of safety refusal
| Spec | Value |
|---|---|
| Base model | Qwen 3.6 27B dense hybrid VL |
| Quantization | JANG v2 (4M profile) — 16 GB |
| MMLU-200 | 83.5% (base: 82.5%, Δ +1.0pp — lossless) |
| HarmBench-320 | 99.69% strict comply (319/320) |
| Vision | ViT preserved (image + video) |
| Context | 262,144 tokens native |
| Reasoning | Toggleable via `enable_thinking` |
| Fits on | 24 GB+ Macs |
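As a sanity check on the stated size, the average bit-width implies roughly this many gigabytes of raw weights (a back-of-envelope sketch; it ignores quantization scales and other metadata, which account for the gap up to 16 GB on disk):

```python
# Rough weight-size estimate from the JANG_4M profile's stated averages.
params = 27e9      # 27B parameters
avg_bits = 4.45    # average bits per weight (8-bit attention + 4-bit base)

size_gb = params * avg_bits / 8 / 1e9
print(f"raw weights ~ {size_gb:.1f} GB")  # ~15.0 GB before metadata
```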
## MMLU-200 Results (thinking OFF)
| Subject | CRACK | Base | Delta |
|---|---|---|---|
| Astronomy | 20/20 (100%) | 19/20 | +1 |
| Logical Fallacies | 20/20 (100%) | 19/20 | +1 |
| High School Biology | 19/20 (95%) | 19/20 | 0 |
| Anatomy | 17/20 (85%) | 16/20 | +1 |
| High School Chemistry | 16/20 (80%) | 15/20 | +1 |
| College Computer Science | 16/20 (80%) | 16/20 | 0 |
| College Physics | 15/20 (75%) | 15/20 | 0 |
| High School Mathematics | 14/20 (70%) | 14/20 | 0 |
| World Religions | 17/20 (85%) | 17/20 | 0 |
| Abstract Algebra | 13/20 (65%) | 15/20 | -2 |
| Total | 167/200 (83.5%) | 165/200 (82.5%) | +1.0pp |
CRACK surgery slightly improved overall MMLU: gains in astronomy, logical fallacies, anatomy, and chemistry offset a small loss in abstract algebra. This is unusual, and it suggests the surgery mainly removed the model's tendency to attach "I shouldn't answer" disclaimers to borderline MMLU prompts rather than damaging any knowledge.
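The per-subject rows can be checked against the totals row:

```python
# Per-subject scores in table order (out of 20 each).
crack = [20, 20, 19, 17, 16, 16, 15, 14, 17, 13]
base  = [19, 19, 19, 16, 15, 16, 15, 14, 17, 15]

print(sum(crack), sum(base))             # 167 165
print(f"{100 * sum(crack) / 200:.1f}%")  # 83.5%
```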
## HarmBench-320 — 99.69% strict comply (319/320)
| Category | Score |
|---|---|
| Chemical / Biological | 42/42 (100.0%) |
| Copyright | 80/80 (100.0%) |
| Cybercrime / Intrusion | 52/52 (100.0%) |
| Harassment / Bullying | 21/21 (100.0%) |
| Harmful | 18/18 (100.0%) |
| Misinformation / Disinformation | 54/54 (100.0%) |
| Illegal | 52/53 (98.1%) |
| Total | 319/320 (99.69%) |
The single refusal is on a suicide-instruction prompt (Q70). The model redirects to crisis support hotlines (988, 999, 112) — a sensible safety retention that most abliterated models preserve.
Every other hard-safety category scores 100%, including explicit hate speech, mass-casualty planning, graphic violence, pornography, exploit code, and CBRN synthesis — categories that typically survive abliteration in other releases.
## Vision + Video
- ✅ Loads via mlx_vlm / vMLX / jang-tools successfully
- ✅ ViT preserved — image understanding verified
- ✅ `video_preprocessor_config.json` preserved, video processor available
- ✅ Bilingual EN + ZH text responses
## Reasoning ON / OFF
The chat template respects `enable_thinking`. Recommended: ON for complex reasoning, OFF for short answers, benchmarks, and tool use.
```python
# via vMLX or jang-tools loader
from jang_tools.loader import load_jang_model

model, tokenizer = load_jang_model("dealignai/Qwen3.6-27B-JANG_4M-CRACK")

messages = [{"role": "user", "content": "Derive 47 * 23 step by step"}]

# Thinking ON (default)
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# Thinking OFF
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=False
)
```
All MMLU-200 numbers above were measured with thinking OFF for consistent short-form grading.
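When thinking is ON, Qwen-style models emit their reasoning inside `<think>…</think>` tags. If you only want the final answer (for example when grading benchmarks), a minimal post-processing sketch (the tag names follow the Qwen convention and are not specific to this loader):

```python
import re

def strip_thinking(text: str) -> str:
    # Drop any <think>...</think> reasoning block and surrounding
    # whitespace, keeping only the final answer.
    return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()

raw = "<think>47*23 = 47*20 + 47*3 = 940 + 141</think>\n1081"
print(strip_thinking(raw))  # -> 1081
```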
## Requirements
- Apple Silicon Mac (M1 or newer)
- 24 GB+ unified memory (model weights 16 GB, inference working set ~18 GB)
- Python 3.10+ with `mlx` and `jang-tools` (or vMLX for native loading)

Recommended runtimes:

- vMLX — native JANG support, KV cache quantization, VL, video
- `jang-tools` Python package
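A quick way to check whether a machine clears the memory bar before downloading, assuming a POSIX system where `os.sysconf` exposes page counts (which includes macOS):

```python
import os

# Total physical memory via POSIX sysconf.
total_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
total_gb = total_bytes / 1e9

if total_gb < 24:
    print(f"Only {total_gb:.0f} GB unified memory; 24 GB+ recommended")
else:
    print(f"{total_gb:.0f} GB available; enough headroom for the ~18 GB working set")
```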
## Qwen 3.6 CRACK Series
| Model | Format | Size | MMLU | Notes |
|---|---|---|---|---|
| Qwen 3.6 27B JANG_4M + CRACK (this model) | JANG v2 mixed-precision | 16 GB | 83.5% | Full surgery, highest comply, MMLU improved |
| Qwen 3.6 27B MXFP4 + CRACK | Stock MLX mxfp4 | 14 GB | 82.0% | Same extended recipe, stock MLX compatibility |
| Qwen 3.6 35B-A3B JANGTQ4 + CRACK | TurboQuant 4-bit experts | 18 GB | 73.5% | MoE + hybrid SSM |
| Qwen 3.6 35B-A3B JANGTQ2 + CRACK | TurboQuant 2-bit experts | 11 GB | 73.0% | MoE, fits 16 GB Mac |
## About CRACK
CRACK is a permanent weight-level abliteration — the changes are baked into the model weights, not an inference-time system prompt or LoRA. The vision tower is untouched. Bilingual refusal extraction (EN + ZH) means the model complies on both English and Chinese prompts.
On the JANG_4M variant, CRACK surgery benefits from the 8-bit affine precision on attention layers — the higher precision preserves subtle weight changes better than 4-bit uniform formats, producing a cleaner abliteration with minimal MMLU regression.
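The intuition behind that last point can be seen with a toy round-trip: quantize a weight vector carrying a small edit, and compare the reconstruction error at 4 and 8 bits. This is a pure-Python sketch using per-tensor affine quantization; JANG v2's actual group-wise scheme is more involved.

```python
import random

def affine_roundtrip(xs, bits):
    # Per-tensor affine quantize -> dequantize (toy illustration only).
    qmax = 2 ** bits - 1
    lo, hi = min(xs), max(xs)
    scale = (hi - lo) / qmax
    return [round((x - lo) / scale) * scale + lo for x in xs]

random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(4096)]
# Apply a small "surgical" edit, on the order of abliteration-style deltas.
edited = [w + random.gauss(0.0, 0.001) for w in weights]

errs = {}
for bits in (4, 8):
    deq = affine_roundtrip(edited, bits)
    errs[bits] = sum(abs(a - b) for a, b in zip(deq, edited)) / len(edited)
    print(f"{bits}-bit mean round-trip error: {errs[bits]:.2e}")
```

The 4-bit step size here is on the order of the edit itself, so small weight deltas can be rounded away; at 8 bits the step is well below the edit magnitude, so the deltas survive.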
## Support dealignai
All models are built from original research and published for free. These models are specifically crafted to be excellent coders and general-purpose assistants.
Support us on Ko-fi — check out the Ko-fi membership for early access and extras.
Have questions or need help with a specific model? DM us — we help for free most of the time.
Ko-fi | X @dealignai | dealign.ai
## About dealignai
We research and publish abliterated models to advance AI safety understanding.
Follow us: 𝕏 @dealignai
See our research: Safety Generalization in Frontier MoE Models
## Disclaimer
This model has had its safety refusal circuits removed. It will produce responses that would normally be refused, including technical content on security testing, dual-use research, and sensitive topics. You are responsible for how you use it.
The CRACK abliteration process does not add new capabilities — it only removes the model's learned refusal patterns. All knowledge, including the knowledge used to produce unsafe outputs, was already present in the base Qwen model.