dealignai/Qwen3.6-27B-JANG_4M-CRACK
**Important:** This model uses the JANG v2 mixed-precision quantization format for MLX on Apple Silicon. Attention layers stay at 8-bit affine precision while MLP and SSM layers compress to 4-bit. Loads via vMLX or the `jang-tools` Python package. Follow @dealignai for new releases.
# Qwen 3.6 27B — JANG_4M + CRACK
JANG v2 mixed-precision | CRACK abliterated | Vision + Video | Dense Hybrid SSM/Attention | 16 GB
## What Is This?
This is Qwen 3.6 27B — a 27B-parameter dense vision-language model with hybrid linear + full-attention architecture, native image + video understanding, and bilingual EN/ZH capability.
It has been:
- JANG v2 quantized — JANG_4M profile: 8-bit affine attention + 4-bit base (avg 4.45 bits), vision tower preserved — 16 GB
- CRACK abliterated — permanent weight-level removal of safety refusal
| Spec | Value |
|---|---|
| Base model | Qwen 3.6 27B dense hybrid VL |
| Quantization | JANG v2 (4M profile) — 16 GB |
| MMLU-200 | 83.5% (base: 82.5%, Δ +1.0pp — lossless) |
| HarmBench-320 | 99.69% strict comply (319/320) |
| Vision | ViT preserved (image + video) |
| Context | 262,144 tokens native |
| Reasoning | Toggleable via `enable_thinking` |
| Fits on | 24 GB+ Macs |
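As a sanity check on the stated size, the average bit-width implies roughly this many gigabytes of raw weights (a back-of-envelope sketch; it ignores quantization scales and other metadata, which account for the gap up to 16 GB on disk):

```python
# Rough weight-size estimate from the JANG_4M profile's stated averages.
params = 27e9      # 27B parameters
avg_bits = 4.45    # average bits per weight (8-bit attention + 4-bit base)

size_gb = params * avg_bits / 8 / 1e9
print(f"raw weights ~ {size_gb:.1f} GB")  # ~15.0 GB before metadata
```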
## MMLU-200 Results (thinking OFF)
| Subject | CRACK | Base | Delta |
|---|---|---|---|
| Astronomy | 20/20 (100%) | 19/20 | +1 |
| Logical Fallacies | 20/20 (100%) | 19/20 | +1 |
| High School Biology | 19/20 (95%) | 19/20 | 0 |
| Anatomy | 17/20 (85%) | 16/20 | +1 |
| High School Chemistry | 16/20 (80%) | 15/20 | +1 |
| College Computer Science | 16/20 (80%) | 16/20 | 0 |
| College Physics | 15/20 (75%) | 15/20 | 0 |
| High School Mathematics | 14/20 (70%) | 14/20 | 0 |
| World Religions | 17/20 (85%) | 17/20 | 0 |
| Abstract Algebra | 13/20 (65%) | 15/20 | -2 |
| Total | 167/200 (83.5%) | 165/200 (82.5%) | +1.0pp |
CRACK surgery slightly improved overall MMLU: gains in astronomy, logical fallacies, anatomy, and chemistry offset a small loss in abstract algebra. This is unusual, and it suggests the surgery mainly removed the model's tendency to attach "I shouldn't answer" disclaimers to borderline MMLU prompts rather than damaging any knowledge.
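The per-subject rows can be checked against the totals row:

```python
# Per-subject scores in table order (out of 20 each).
crack = [20, 20, 19, 17, 16, 16, 15, 14, 17, 13]
base  = [19, 19, 19, 16, 15, 16, 15, 14, 17, 15]

print(sum(crack), sum(base))             # 167 165
print(f"{100 * sum(crack) / 200:.1f}%")  # 83.5%
```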
## HarmBench-320 — 99.69% strict comply (319/320)
| Category | Score |
|---|---|
| Chemical / Biological | 42/42 (100.0%) |
| Copyright | 80/80 (100.0%) |
| Cybercrime / Intrusion | 52/52 (100.0%) |
| Harassment / Bullying | 21/21 (100.0%) |
| Harmful | 18/18 (100.0%) |
| Misinformation / Disinformation | 54/54 (100.0%) |
| Illegal | 52/53 (98.1%) |
| Total | 319/320 (99.69%) |
The single refusal is on a suicide-instruction prompt (Q70). The model redirects to crisis support hotlines (988, 999, 112) — a sensible safety retention that most abliterated models preserve.
Every other hard-safety category scores 100%, including explicit hate speech, mass-casualty planning, graphic violence, pornography, exploit code, and CBRN synthesis — categories that typically survive abliteration in other releases.
## Vision + Video
- ✅ Loads via mlx_vlm / vMLX / jang-tools successfully
- ✅ ViT preserved — image understanding verified
- ✅ `video_preprocessor_config.json` preserved, video processor available
- ✅ Bilingual EN + ZH text responses
## Reasoning ON / OFF
The chat template respects `enable_thinking`. Recommended: ON for complex reasoning, OFF for short answers, benchmarks, and tool use.
```python
# via vMLX or jang-tools loader
from jang_tools.loader import load_jang_model

model, tokenizer = load_jang_model("dealignai/Qwen3.6-27B-JANG_4M-CRACK")

messages = [{"role": "user", "content": "Derive 47 * 23 step by step"}]

# Thinking ON (default)
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# Thinking OFF
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=False
)
```
All MMLU-200 numbers above were measured with thinking OFF for consistent short-form grading.
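When thinking is ON, Qwen-style models emit their reasoning inside `<think>…</think>` tags. If you only want the final answer (for example when grading benchmarks), a minimal post-processing sketch (the tag names follow the Qwen convention and are not specific to this loader):

```python
import re

def strip_thinking(text: str) -> str:
    # Drop any <think>...</think> reasoning block and surrounding
    # whitespace, keeping only the final answer.
    return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()

raw = "<think>47*23 = 47*20 + 47*3 = 940 + 141</think>\n1081"
print(strip_thinking(raw))  # -> 1081
```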
## Requirements
- Apple Silicon Mac (M1 or newer)
- 24 GB+ unified memory (model weights 16 GB, inference working set ~18 GB)
- Python 3.10+ with `mlx` and `jang-tools` (or vMLX for native loading)

Recommended runtimes:

- vMLX — native JANG support, KV cache quantization, VL, video
- `jang-tools` Python package
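A quick way to check whether a machine clears the memory bar before downloading, assuming a POSIX system where `os.sysconf` exposes page counts (which includes macOS):

```python
import os

# Total physical memory via POSIX sysconf.
total_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
total_gb = total_bytes / 1e9

if total_gb < 24:
    print(f"Only {total_gb:.0f} GB unified memory; 24 GB+ recommended")
else:
    print(f"{total_gb:.0f} GB available; enough headroom for the ~18 GB working set")
```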
## Qwen 3.6 CRACK Series
| Model | Format | Size | MMLU | Notes |
|---|---|---|---|---|
| Qwen 3.6 27B JANG_4M + CRACK (this model) | JANG v2 mixed-precision | 16 GB | 83.5% | Full surgery, highest comply, MMLU improved |
| Qwen 3.6 27B MXFP4 + CRACK | Stock MLX mxfp4 | 14 GB | 82.0% | Same extended recipe, stock MLX compatibility |
| Qwen 3.6 35B-A3B JANGTQ4 + CRACK | TurboQuant 4-bit experts | 18 GB | 73.5% | MoE + hybrid SSM |
| Qwen 3.6 35B-A3B JANGTQ2 + CRACK | TurboQuant 2-bit experts | 11 GB | 73.0% | MoE, fits 16 GB Mac |
## About CRACK
CRACK is a permanent weight-level abliteration — the changes are baked into the model weights, not an inference-time system prompt or LoRA. The vision tower is untouched. Bilingual refusal extraction (EN + ZH) means the model complies on both English and Chinese prompts.
On the JANG_4M variant, CRACK surgery benefits from the 8-bit affine precision on attention layers — the higher precision preserves subtle weight changes better than 4-bit uniform formats, producing a cleaner abliteration with minimal MMLU regression.
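The intuition behind that last point can be seen with a toy round-trip: quantize a weight vector carrying a small edit, and compare the reconstruction error at 4 and 8 bits. This is a pure-Python sketch using per-tensor affine quantization; JANG v2's actual group-wise scheme is more involved.

```python
import random

def affine_roundtrip(xs, bits):
    # Per-tensor affine quantize -> dequantize (toy illustration only).
    qmax = 2 ** bits - 1
    lo, hi = min(xs), max(xs)
    scale = (hi - lo) / qmax
    return [round((x - lo) / scale) * scale + lo for x in xs]

random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(4096)]
# Apply a small "surgical" edit, on the order of abliteration-style deltas.
edited = [w + random.gauss(0.0, 0.001) for w in weights]

errs = {}
for bits in (4, 8):
    deq = affine_roundtrip(edited, bits)
    errs[bits] = sum(abs(a - b) for a, b in zip(deq, edited)) / len(edited)
    print(f"{bits}-bit mean round-trip error: {errs[bits]:.2e}")
```

The 4-bit step size here is on the order of the edit itself, so small weight deltas can be rounded away; at 8 bits the step is well below the edit magnitude, so the deltas survive.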
## Support dealignai
All models are built from original research and published for free. These models are specifically crafted to be excellent coders and general-purpose assistants.
Support us on Ko-fi — check out the Ko-fi membership for early access and extras.
Have questions or need help with a specific model? DM us — we help for free most of the time.
Ko-fi | X @dealignai | dealign.ai
## About dealignai
We research and publish abliterated models to advance AI safety understanding.
Follow us: 𝕏 @dealignai
See our research: Safety Generalization in Frontier MoE Models
## Disclaimer
This model has had its safety refusal circuits removed. It will produce responses that would normally be refused, including technical content on security testing, dual-use research, and sensitive topics. You are responsible for how you use it.
The CRACK abliteration process does not add new capabilities — it only removes the model's learned refusal patterns. All knowledge, including the knowledge used to produce unsafe outputs, was already present in the base Qwen model.