OpenMed Privacy Filter (Nemotron) — MLX BF16

A native MLX port of OpenMed/privacy-filter-nemotron for fast, on-device PII detection on Apple Silicon. This BF16 artifact preserves the full source precision; for a smaller / faster sibling, see OpenMed/privacy-filter-nemotron-mlx-8bit.

Family at a glance. Same architecture and training data, three runtimes:

  • OpenMed/privacy-filter-nemotron: the base PyTorch / transformers checkpoint
  • OpenMed/privacy-filter-nemotron-mlx: this repo, a native MLX port in BF16 (full source precision)
  • OpenMed/privacy-filter-nemotron-mlx-8bit: the MLX sibling with 8-bit quantized weights (smaller and faster)

What it does

The model is a token classifier built on OpenAI's open Privacy Filter architecture (the same openai_privacy_filter model type used by openai/privacy-filter). It tags each token with a BIOES label across 55 PII span classes; a Viterbi pass over the BIOES grammar then merges those tags into clean entity spans (see the sketch under the label schema below). Detected categories include:

  • Personal identifiers — first_name, last_name, user_name, gender, age, date_of_birth
  • Contact — email, phone_number, fax_number, street_address, city, state, country, county, postcode, coordinate
  • Government / legal IDs — ssn, national_id, tax_id, certificate_license_number
  • Financial — account_number, bank_routing_number, credit_debit_card, cvv, pin, swift_bic
  • Medical — medical_record_number, health_plan_beneficiary_number, blood_type
  • Workplace — company_name, occupation, employee_id, customer_id, employment_status, education_level
  • Online — url, ipv4, ipv6, mac_address, http_cookie, api_key, password, device_identifier
  • Demographic — race_ethnicity, religious_belief, political_view, sexuality, language
  • Vehicles — license_plate, vehicle_identifier
  • Time — date, date_time, time
  • Misc — biometric_identifier, unique_id
Full label schema (221 labels)

The output space is O plus B-, I-, E-, S- for each of the 55 span classes (4 × 55 + 1 = 221). The runtime PrivacyFilterMLXPipeline runs Viterbi over this BIOES grammar, so the consumer sees clean grouped entities rather than raw token tags.

The full id2label.json is shipped alongside the weights in this repo.
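
As a concrete illustration, the label space and the transition grammar can be reconstructed from the class list alone. This is a sketch only; SPAN_CLASSES below is a stand-in for the 55 class names shipped in id2label.json:

# Stand-in for the 55 span classes; the real list ships in id2label.json.
SPAN_CLASSES = ["first_name", "last_name", "email"]  # ... 55 entries in total

# "O" plus B-/I-/E-/S- per class: 4 * 55 + 1 = 221 labels for the full set.
LABELS = ["O"] + [f"{prefix}-{cls}" for cls in SPAN_CLASSES for prefix in ("B", "I", "E", "S")]

# The BIOES grammar the Viterbi pass enforces: B- must be continued by I- or
# E- of the same class; O, E-, and S- may only be followed by O, B-, or S-.
def valid_transition(prev: str, nxt: str) -> bool:
    if prev == "O" or prev[0] in ("E", "S"):       # outside a span, or a span just closed
        return nxt == "O" or nxt[0] in ("B", "S")  # may only stay outside or start fresh
    # prev is B- or I-: the span must continue in the same class
    return nxt[0] in ("I", "E") and nxt[2:] == prev[2:]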

For per-label accuracy, training recipe, and dataset details, see the base PyTorch checkpoint.

Architecture

Field                 Value
Source model type     openai_privacy_filter
Source architecture   OpenAIPrivacyFilterForTokenClassification
Hidden size           640
Transformer layers    8
Attention             Grouped-Query (14 query heads / 2 KV heads, head_dim=64) with attention sinks
FFN                   Sparse Mixture-of-Experts — 128 experts, top-4 routing, SwiGLU
Position encoding     YaRN-scaled RoPE (rope_theta=150_000, factor=32)
Context length        131,072 tokens (initial 4,096)
Tokenizer             o200k_base (tiktoken) — vocab 200,064
Output head           Linear(640 → 221) with bias
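
A minimal shape check of the output head under the numbers above (a standalone sketch, not the repo's modeling code):

import mlx.core as mx
import mlx.nn as nn

hidden_size, num_labels = 640, 221
head = nn.Linear(hidden_size, num_labels, bias=True)   # the Linear(640 → 221) head
hidden_states = mx.zeros((1, 4, hidden_size))          # (batch, seq, hidden) from the transformer stack
logits = head(hidden_states)
print(logits.shape)                                    # (1, 4, 221): one BIOES distribution per token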

File set

File                                    Size     Purpose
weights.safetensors                     2.6 GB   BF16 model weights in OpenMed-MLX layout
config.json                             19 KB    Model + MLX runtime config
id2label.json                           5.4 KB   Numeric ID → BIOES label string
openmed-mlx.json                        0.7 KB   OpenMed MLX manifest (task, family, runtime hints)
tokenizer.json, tokenizer_config.json   27 MB    Source tokenizer files (kept for reference)

The MLX runtime uses tiktoken o200k_base directly for tokenization; the tokenizer.json is kept so consumers can inspect or re-tokenize via transformers if desired.
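
A minimal round-trip with the standalone tiktoken package, assuming only the encoding name given above:

import tiktoken

enc = tiktoken.get_encoding("o200k_base")    # same encoding the MLX runtime uses
ids = enc.encode("Email me at alice.smith@example.com after 5pm.")
print(len(ids))                              # token count the model will see
print(enc.decode(ids))                       # round-trips to the original string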

Quick start

With OpenMed — recommended

OpenMed gives you a single extract_pii() / deidentify() API that auto-selects MLX on Apple Silicon and PyTorch elsewhere — same code on every host.

pip install -U "openmed[mlx]"

from openmed import extract_pii, deidentify

text = (
    "Patient Sarah Johnson (DOB 03/15/1985), MRN 4872910, "
    "phone 415-555-0123, email sarah.johnson@example.com."
)

# Extract grouped entity spans (runs on MLX here, PyTorch fallback elsewhere)
result = extract_pii(text, model_name="OpenMed/privacy-filter-nemotron-mlx")
for ent in result.entities:
    print(f"{ent.label:30s} {ent.text!r}  conf={ent.confidence:.2f}")

# De-identify
masked = deidentify(text, method="mask",
                    model_name="OpenMed/privacy-filter-nemotron-mlx")
fake   = deidentify(
    text,
    method="replace",
    model_name="OpenMed/privacy-filter-nemotron-mlx",
    consistent=True,
    seed=42,   # deterministic locale-aware Faker surrogates
)

When MLX isn't available (Linux, Windows, Intel Mac, missing mlx package), this exact same call automatically falls back to the PyTorch checkpoint OpenMed/privacy-filter-nemotron with a one-time warning. Family-aware fallback: a Nemotron MLX request never substitutes the unrelated openai/privacy-filter baseline.

Direct MLX usage (lower-level)

from huggingface_hub import snapshot_download
from openmed.mlx.inference import PrivacyFilterMLXPipeline

model_path = snapshot_download("OpenMed/privacy-filter-nemotron-mlx")
pipe = PrivacyFilterMLXPipeline(model_path)

print(pipe("Email me at alice.smith@example.com after 5pm."))
# [{'entity_group': 'email',
#   'score': 0.92,
#   'word': 'alice.smith@example.com',
#   'start': 12,
#   'end': 35}]

The pipeline returns a list of dicts with entity_group, score, word, start, and end (character offsets into the input string).
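
Because start and end are plain character offsets, downstream masking needs nothing model-specific. A sketch using only the dict keys listed above:

def mask_spans(text, entities):
    # Replace spans right-to-left so earlier character offsets stay valid.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        text = text[:ent["start"]] + f"[{ent['entity_group']}]" + text[ent["end"]:]
    return text

entities = [{"entity_group": "email", "score": 0.92,
             "word": "alice.smith@example.com", "start": 12, "end": 35}]
print(mask_spans("Email me at alice.smith@example.com after 5pm.", entities))
# Email me at [email] after 5pm.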

Loading from a local snapshot

from openmed.mlx.models import load_model
import mlx.core as mx

model = load_model("/path/to/privacy-filter-nemotron-mlx")
ids = mx.array([[1, 100, 200, 300]], dtype=mx.int32)
mask = mx.ones((1, 4), dtype=mx.bool_)
logits = model(ids, attention_mask=mask)   # shape (1, 4, 221)
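
To read those logits as tags, map argmax ids back through the shipped id2label.json. This is a sketch that assumes the JSON keys are stringified integer ids; in normal use the pipeline's Viterbi pass handles this and groups spans for you:

import json

with open("/path/to/privacy-filter-nemotron-mlx/id2label.json") as f:
    id2label = json.load(f)                   # assumed shape: {"0": "O", "1": "B-...", ...}

pred_ids = mx.argmax(logits, axis=-1)         # (1, 4): highest-scoring raw tag per token
tags = [id2label[str(i)] for i in pred_ids[0].tolist()]
print(tags)                                   # raw BIOES tags, not yet grouped into spans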

Hardware notes

  • Designed for Apple Silicon (M-series GPUs); CPU inference works but is slower.
  • Tested on macOS with mlx>=0.18. The MLX runtime in this repo is independent of mlx_lm (token classification, not causal LM).
  • Forward pass on a typical PII sentence (~10 tokens) takes ~14 ms on M-series GPU after warmup. For lower latency or smaller memory footprint, use the -mlx-8bit sibling instead.

Credits & Acknowledgements

This model wouldn't exist without two open-source releases — sincere thanks to both teams:

  • OpenAI for open-sourcing the Privacy Filter (architecture, modeling code, and opf training/eval CLI). The MLX port in this repo runs that same architecture under Apple's MLX framework.
  • NVIDIA for releasing the Nemotron-PII dataset used to fine-tune the source PyTorch checkpoint.

Additional thanks to Apple for MLX and the Hugging Face team for the model-distribution ecosystem.

License

Apache 2.0 (matches the source checkpoint).
