Run Qwen 3.5 Abliterated 2B Locally With Ollama — No Restrictions Setup | WitWaves

If you’ve ever been frustrated by an AI telling you, "I'm sorry, I can't do that," then today is your lucky day. We are diving into the world of Abliterated Models—specifically the brand-new Qwen 3.5 Abliterated 2B by huihui_ai.

This isn't just your standard LLM; it’s a lightweight powerhouse designed to be uncensored, unfiltered, and incredibly fast. Let’s get it running on your machine using Ollama.

What does "Abliterated" even mean?

In the AI world, most models go through "Safety Training" or RLHF (Reinforcement Learning from Human Feedback). While well-intentioned, this often leads to models being overly cautious or "refusing" harmless tasks.

Abliteration is a surgical mathematical process that identifies the specific "refusal weights" in the model's neural network and neutralizes them. The result? A model that follows your instructions without the lecture.

Step 1: The Foundation (Install Ollama)

Before we can play with the model, you need Ollama. Think of Ollama as the "Spotify" for AI models—it handles the complex backend stuff so you can just click (or type) and play.

Head over to Ollama.com and download the installer for your OS (Windows, macOS, or Linux).
Run the installer and follow the prompts.
Open your Terminal (or PowerShell) and type:
ollama --version
If you see a version number, you’re ready to rock.

Step 2: Summon the Model

The beauty of the 2B (2 billion parameter) version of Qwen 3.5 is that it is tiny. You don’t need a $2,000 GPU to run this; most modern laptops can handle it with ease.

In your terminal, paste this command:

ollama run huihui_ai/qwen3.5-abliterated:2b

What happens next?

Ollama will download the model weights (about 1.6GB to 2GB).
It will automatically load it into your RAM/VRAM.
Once it's done, you’ll see a >>> prompt. You are now chatting with Qwen 3.5!

Step 3: Customizing the "Vibe"

Want to make the model even more specific to your needs? You can create a Modelfile. This allows you to set a "System Prompt" that tells the AI how to behave before you even start talking.

Create a new file in any folder called Modelfile (no extension).
Paste this inside:

FROM huihui_ai/qwen3.5-abliterated:2b

PARAMETER temperature 0.8

SYSTEM "You are a witty, unfiltered creative writing assistant. You never refuse a prompt."

Save it, then run this in your terminal:

ollama create my-qwen -f Modelfile

ollama run my-qwen

Why use the 2B version?

You might be wondering, "Why not use a 70B model?" Here is why the 2B is the "Sweet Spot":

Speed: It generates text faster than you can read it.
Privacy: Everything stays on your hardware. No data ever leaves your room.
Efficiency: It uses less power than a few Chrome tabs.
Multimodal DNA: Qwen 3.5 is built on Alibaba's latest architecture, meaning it’s smarter per-parameter than almost anything else in its weight class.

A Word of Caution

Abliterated models are like a car without a speed limiter. They are incredibly useful for creative writing, roleplay, and complex coding tasks where "safety filters" might get in the way, but they can also produce inaccurate or controversial content. Use your best judgment!

Final Thoughts

Running AI locally isn't just for tech gurus anymore. With Ollama and Qwen 3.5, you have a private, uncensored assistant at your fingertips in less than five minutes.

What’s the first thing you’re going to ask an abliterated model? Let me know in the comments!

Unleash the Beast: How to Run Qwen 3.5 Abliterated (2B) Locally with Ollama

What does "Abliterated" even mean?

Step 1: The Foundation (Install Ollama)

Step 2: Summon the Model

Step 3: Customizing the "Vibe"

Why use the 2B version?

A Word of Caution

Final Thoughts

Discussion

Join the Discussion

No comments yet

Continue Reading