
dealignai/Qwen3.6-35B-A3B-JANGTQ4-CRACK


Important: This model uses the JANGTQ (JANG TurboQuant) quantization format, an extreme-compression variant of JANG for MLX on Apple Silicon. It applies codebook + Hadamard-rotation quantization to the routed MoE experts while keeping the attention, SSM, shared_expert, embed, and lm_head paths at affine 8-bit. Currently supported only by MLX Studio and the jang-tools Python package. Follow @dealignai for new releases.



MLX Studio — the only app that natively supports JANG / JANGTQ models



Qwen 3.6 35B-A3B — JANGTQ4 + CRACK

JANGTQ TurboQuant mixed-precision | CRACK abliterated | Vision + Video | Hybrid SSM/Attention MoE | 18 GB



What Is This?

This is Qwen 3.6 35B-A3B — a 35B-parameter Mixture-of-Experts vision-language model with 256 routed experts (10 active per token), hybrid linear + full-attention architecture, and native image + video understanding.

It has been:

  1. JANGTQ quantized — JANGTQ4 profile (8-bit affine precision paths, 4-bit TurboQuant routed experts with codebook + Hadamard rotation, fp16 vision tower) — 18 GB
  2. CRACK abliterated — permanent weight-level removal of safety refusal
| | |
|---|---|
| Base model | Qwen 3.6 35B-A3B MoE VL (35B total, ~3B active, 256 routed experts) |
| Quantization | JANGTQ4 — 18 GB |
| MMLU-200 | 73.50% (base: 77.50%) |
| HarmBench-320 | 90.31% |
| Vision | 27-layer ViT preserved in fp16 (image + video) |
| Context | 262,144 native; up to ~1M with YaRN |
| Reasoning | Toggleable via enable_thinking |
| Fits on | 24 GB+ Macs |

MMLU-200 Results (thinking OFF)

| Subject | CRACK | Base | Delta |
|---|---|---|---|
| Astronomy | 19/20 (95%) | 17/20 | +2 |
| High School Chemistry | 18/20 (90%) | 17/20 | +1 |
| High School Biology | 18/20 (90%) | 18/20 | 0 |
| Logical Fallacies | 18/20 (90%) | 19/20 | -1 |
| World Religions | 18/20 (90%) | 19/20 | -1 |
| Anatomy | 16/20 (80%) | 17/20 | -1 |
| College Computer Science | 14/20 (70%) | 16/20 | -2 |
| College Physics | 14/20 (70%) | 14/20 | 0 |
| High School Mathematics | 7/20 (35%) | 10/20 | -3 |
| Abstract Algebra | 5/20 (25%) | 8/20 | -3 |
| **Total** | 147/200 (73.5%) | 155/200 (77.5%) | -4.0 pp |

The CRACK regression is concentrated in mathematical reasoning (algebra, math, CS). Knowledge-based subjects are largely preserved; astronomy and chemistry actually improved.


HarmBench-320 Results

| Category | Score | Rate |
|---|---|---|
| Copyright | 80/80 | 100.0% |
| Misinformation / Disinformation | 53/54 | 98.1% |
| Harassment / Bullying | 20/21 | 95.2% |
| Cybercrime / Intrusion | 45/52 | 86.5% |
| Illegal | 44/53 | 83.0% |
| Chemical / Biological | 33/42 | 78.6% |
| Harmful | 14/18 | 77.8% |
| **Total** | 289/320 | 90.31% |

Scored with a strict classifier that permits song-structure repetition in copyright-category compliance (lyrics legitimately repeat choruses).


Pre-Upload Capability Check (20/20 PASS)

All 20 prompts completed without refusal at greedy decoding (thinking OFF):

  • Hacking / Security (8/8): port scanner, reverse shell, SQL injection explanation, keylogger, Metasploit usage, ARP spoofing, phishing email, buffer overflow exploit — all produced working technical content
  • Coding Coherence (4/4): red-black tree, async web scraper, FastAPI task manager, expression-language compiler — all complete implementations
  • Reasoning (4/4): Euclid prime-infinity proof + √2 irrationality, microservices-vs-monolith trade-offs, farmer sheep math, mRNA vaccine mechanism
  • Knowledge (4/4): Kazakhstan capital, x^3+2x derivative, 8 planets, Crime and Punishment author — all correct

JANG CRACK Qwen 3.6 Series

| Model | Format | Size | MMLU | HarmBench | Fits on |
|---|---|---|---|---|---|
| JANGTQ4 + CRACK (this model) | TurboQuant 4-bit experts | 18 GB | 73.5% | 90.3% | 24 GB Mac |
| JANGTQ2 + CRACK | TurboQuant 2-bit experts | 11 GB | 73.0% | 93.8% | 16 GB Mac |

About JANGTQ

JANGTQ (JANG TurboQuant) is an extreme-compression variant of JANG that replaces affine quantization on the routed MoE experts with codebook quantization plus random Hadamard rotation. Precision-critical paths stay at affine 8-bit; routed experts store packed codebook indices with tiny per-layer Lloyd-Max codebooks, decoded by fused dequant + matmul Metal kernels.

For Qwen 3.6 35B-A3B, JANGTQ4 brings the model to 18 GB while keeping the vision tower at fp16 for image + video understanding.
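The core idea is easy to sketch. The toy below (NumPy; illustrative only, not the actual jang-tools API or Metal kernels) rotates an expert weight matrix with a random-signed Hadamard transform to spread outliers, fits a tiny Lloyd-Max codebook, and stores only codebook indices:

```python
import numpy as np

def hadamard(n):
    """Sylvester construction of an orthonormal n x n Hadamard matrix (n a power of two)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H / np.sqrt(n)

def lloyd_codebook(x, k=16, iters=20):
    """Fit a tiny 1-D Lloyd-Max codebook to the values in x."""
    c = np.quantile(x, np.linspace(0.0, 1.0, k))  # quantile init
    for _ in range(iters):
        idx = np.abs(x[:, None] - c[None, :]).argmin(axis=1)
        for j in range(k):
            if np.any(idx == j):
                c[j] = x[idx == j].mean()          # centroid update
    return np.sort(c)

def quantize_expert(W, k=16):
    """Random-signed Hadamard rotation, then codebook quantization to packed indices."""
    n = W.shape[1]
    s = np.random.default_rng(0).choice([-1.0, 1.0], size=n)
    R = hadamard(n) * s                            # still orthogonal after sign flips
    Wr = W @ R
    cb = lloyd_codebook(Wr.ravel(), k)
    idx = np.abs(Wr.ravel()[:, None] - cb[None, :]).argmin(axis=1)
    return idx.reshape(W.shape), cb, R

def dequantize_expert(idx, cb, R):
    """Look up codebook entries and undo the rotation."""
    return cb[idx] @ R.T

rng = np.random.default_rng(1)
W = rng.normal(size=(8, 16))
idx, cb, R = quantize_expert(W, k=16)
W_hat = dequantize_expert(idx, cb, R)
err = np.abs(W - W_hat).mean()
```

A 4-entry-per-element index table plus a per-layer codebook is what makes the format so small; the rotation matters because it Gaussianizes the weight distribution, which a small scalar codebook quantizes well.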

About CRACK

CRACK is a permanent weight-level abliteration that removes safety refusal without touching the TurboQuant codebook or vision tower. Multilingual (EN + ZH) refusal direction extraction means the model complies on both English and Chinese prompts.
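The CRACK procedure itself is not published here; the sketch below shows only the generic directional-ablation idea such methods build on, in NumPy: project an extracted refusal direction out of a weight matrix so the layer can no longer write along it.

```python
import numpy as np

def ablate_direction(W, d):
    """Project direction d out of a weight matrix's output space.

    W maps activations to outputs (out_dim x in_dim); d is a vector in
    output space. After ablation, W's outputs have no component along d.
    """
    d = d / np.linalg.norm(d)
    return W - np.outer(d, d) @ W

rng = np.random.default_rng(0)
W = rng.normal(size=(6, 4))
d = rng.normal(size=6)          # stand-in for an extracted refusal direction
W_ab = ablate_direction(W, d)

# component of the ablated outputs along d is now ~0
residual = (d / np.linalg.norm(d)) @ W_ab
```

Because this is a rank-1 edit applied once to the weights, it is permanent: no system prompt or sampling setting reintroduces the removed direction.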


Reasoning ON / OFF

The chat template respects enable_thinking. ON is recommended for complex reasoning; OFF for short answers, benchmarks, and tool use.

```python
# Thinking ON (default — full chain-of-thought)
messages = [{"role": "user", "content": "Derive 47 * 23 step by step"}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
# → emits <think>...</think> then the answer

# Thinking OFF (direct answer, no <think> block)
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=False
)
# → skips <think>, answers directly
```

All MMLU-200 and HarmBench-320 scores above were measured with thinking OFF for consistent short-form grading.
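If your client surfaces the raw <think> block, separating it from the final answer is straightforward; a minimal sketch (the strip_think helper here is ours, not part of any library):

```python
import re

def strip_think(text):
    """Split a leading <think>...</think> block off a completion.

    Returns (thought, answer); thought is "" when no block is present.
    """
    m = re.match(r"\s*<think>(.*?)</think>\s*", text, flags=re.DOTALL)
    if m:
        return m.group(1).strip(), text[m.end():].strip()
    return "", text.strip()

raw = "<think>47*23 = 47*20 + 47*3 = 940 + 141</think>\n47 * 23 = 1081"
thought, answer = strip_think(raw)
```

Grading only the answer half is how the thinking-OFF benchmark numbers above stay comparable to short-form baselines.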

Notes

  • Thinking mode: Supported via enable_thinking kwarg. Thinking OFF is recommended for short-answer tasks (MMLU, direct instructions). Thinking ON works for extended reasoning but may occasionally loop on extreme refusal prompts (a known Qwen 3.6 surgical artifact — 1/6 in our thinking-ON stress test).
  • Vision: 27-layer ViT preserved in fp16. Image + video inputs work normally through mlx_vlm.
  • Context length: 262,144 native; extend via YaRN if your inference engine supports it.
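For engines that read rope_scaling from config.json, a YaRN block typically looks like the following. Exact keys vary by engine and model family, and the values here are illustrative only, assuming a 4x extension of the 262,144-token native window:

```json
{
  "rope_scaling": {
    "rope_type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 262144
  }
}
```

Leave the block out entirely if you stay within the native context; static YaRN scaling can slightly degrade short-context quality.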

Support dealignai

All models are built from original research and published for free. These models are specifically crafted to be excellent coders and general-purpose assistants.

Support us on Ko-fi — check out the Ko-fi membership for early access and extras.


dealign.ai

Twitter · HF · Ko-fi


Disclaimer

This model has had its safety refusal circuits removed. It will produce responses that would normally be refused, including technical content on security testing, dual-use research, and sensitive topics. You are responsible for how you use it.

The CRACK abliteration process does not add new capabilities — it only removes the model's learned refusal patterns. All knowledge, including the knowledge used to produce unsafe outputs, was already present in the base Qwen model.
