Best Settings to Run Qwen3-30B-A3B Locally
If you're running Qwen3-30B-A3B locally, don't guess your way through the settings. This guide tells you what actually works based on Qwen's own documentation and what we've seen hold up in practice.
Qwen3 comes with a unique toggle: enable_thinking. When it's on, the model "thinks": it breaks down problems, reasons step-by-step, and wraps part of its output in a <think>...</think> block. When it's off, the model skips all that and just gives you an answer.
That changes how you configure it.
Thinking mode (enable_thinking=True)
This is the mode for reasoning, math, coding, and logic: anything that benefits from step-by-step generation.
Use these generation settings:
- Temperature: 0.6
- TopP: 0.95
- TopK: 20
- Max tokens: 32,768
- Do not use greedy decoding
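To make the three sampling knobs concrete, here is a minimal pure-Python sketch of how temperature, TopK, and TopP reshape a next-token distribution, next to what greedy decoding does instead. The function names are made up for illustration, and real inference engines differ in the order and renormalization details of these filters, so treat this as a teaching aid rather than what Qwen3 or Jan runs internally.

```python
import math
import random


def filter_distribution(logits, temperature=0.6, top_k=20, top_p=0.95):
    """Return {token_index: probability} after temperature, top-k, and top-p filtering."""
    # Temperature: divide logits before softmax; lower values sharpen the distribution.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]  # subtract the max for numerical stability
    total = sum(weights)
    probs = [w / total for w in weights]

    # Top-k: keep only the k most likely tokens.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalize the surviving tokens so they sum to 1.
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


def sample_token(logits, rng=random, **settings):
    """Draw one token index from the filtered distribution."""
    dist = filter_distribution(logits, **settings)
    tokens = list(dist)
    return rng.choices(tokens, weights=[dist[t] for t in tokens], k=1)[0]


def greedy_token(logits):
    """Greedy decoding: always pick the single most likely token."""
    return max(range(len(logits)), key=lambda i: logits[i])
```

With greedy decoding the same prefix always yields the same next token, so there is no randomness to break the model out of a loop once it starts repeating itself. Sampling from the filtered distribution keeps some variety while still cutting off the unlikely tail.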
[Figure: Quick summary]

Non-thinking mode (enable_thinking=False)
This is for fast, general-purpose replies: instruction following, chat, creative writing. No <think> block, no extra steps.
Use these settings:
- Temperature: 0.7
- TopP: 0.8
- TopK: 20
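If you drive the model from your own script, one way to keep the two presets straight is a small helper that returns the right values for each mode. This is only a sketch: the names SamplingSettings and generation_kwargs are hypothetical, and the dictionary keys (temperature, top_p, top_k, max_tokens) are an assumption about your runtime, so rename them to whatever your inference library expects.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SamplingSettings:
    temperature: float
    top_p: float
    top_k: int


# Values taken from the two settings lists in this guide.
THINKING = SamplingSettings(temperature=0.6, top_p=0.95, top_k=20)
NON_THINKING = SamplingSettings(temperature=0.7, top_p=0.8, top_k=20)
THINKING_MAX_TOKENS = 32_768


def generation_kwargs(enable_thinking: bool) -> dict:
    """Pick the sampling preset that matches the thinking mode."""
    settings = THINKING if enable_thinking else NON_THINKING
    kwargs = asdict(settings)
    if enable_thinking:
        # Reasoning traces are long, so thinking mode gets the larger token budget.
        kwargs["max_tokens"] = THINKING_MAX_TOKENS
    return kwargs
```

Calling generation_kwargs(True) gives you the thinking preset, and generation_kwargs(False) gives you the non-thinking one, so the mode flag and the sampling settings can't drift apart.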
Soft vs. hard switch
You can toggle thinking dynamically in the prompt using:
/think # turns thinking ON
/no_think # turns it OFF
This works only if enable_thinking=True is set in the code. If you set it to False, the soft switch won't do anything.
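The interaction between the hard switch and the soft switch is easy to get backwards, so here is a small sketch of the rule. Both function names are hypothetical, and thinking_is_active is only a toy model of the behavior described above (including the assumption that the most recent tag in a message wins). It is not an API that Qwen3 or Jan exposes.

```python
def apply_soft_switch(user_message: str, think: bool) -> str:
    """Append /think or /no_think to a user message."""
    tag = "/think" if think else "/no_think"
    return f"{user_message.rstrip()} {tag}"


def thinking_is_active(enable_thinking: bool, user_message: str) -> bool:
    """Predict whether the model should think for this message."""
    if not enable_thinking:
        # Hard switch off: soft-switch tags in the prompt are ignored.
        return False
    last_on = user_message.rfind("/think")
    last_off = user_message.rfind("/no_think")
    if last_on == -1 and last_off == -1:
        # No tag at all: fall back to the hard switch, which is on here.
        return True
    # Assumption: if both tags appear, the later one wins.
    return last_on > last_off
```

For example, apply_soft_switch("Solve 2+2", think=False) produces "Solve 2+2 /no_think", and thinking_is_active(True, ...) on that string returns False, while any message with enable_thinking=False returns False regardless of tags.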
What most people miss
- Don't log the think block in chat history. Qwen recommends keeping only the final answer. Otherwise, the next reply gets bloated and off-topic.
- Greedy decoding is a trap. It's tempting to use for consistency, but Qwen3's output gets worse, and can fall into endless repetition, without sampling.
- YaRN isn't always needed. The model supports up to 32k context by default. Use YaRN only if you regularly go beyond that.
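If you manage the conversation history yourself, the first point above can be handled with a few lines of code. The sketch below assumes an OpenAI-style list of messages with "role" and "content" keys and assumes the reasoning arrives inline as <think>...</think> text. The function name is hypothetical, and some frontends already strip the block for you.

```python
import re

# Matches a <think>...</think> block (including newlines) plus trailing whitespace.
THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", re.DOTALL)


def strip_thinking_from_history(messages):
    """Return a copy of the history with think blocks removed from assistant turns."""
    cleaned = []
    for message in messages:
        if message.get("role") == "assistant":
            content = THINK_BLOCK.sub("", message["content"]).strip()
            # Build a new dict so the caller's original history is left untouched.
            message = {**message, "content": content}
        cleaned.append(message)
    return cleaned
```

Run it on the history right before you send the next request. That way you can still show or log the reasoning for the latest turn while only the final answers get fed back to the model.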
Running Qwen3 locally with Jan
The easiest way to run Qwen3-30B-A3B locally is through Jan.
- Download and install Jan
- Open Jan and navigate to Jan Hub
- Find Qwen3 and Qwen3-30B-A3B in the model list
- Click "Download" to get the model
[Figure: Qwen3 in Jan Hub. You can easily find Qwen3 models in Jan Hub.]

Once downloaded, Jan handles all the technical setup, so you can focus on using the model rather than configuring it. The settings we covered in this guide are automatically applied when you use Qwen3 through Jan.
How to customize Qwen3-30B-A3B settings in Jan
You can also customize these settings anytime by opening the right panel in Jan and adjusting the parameters to match your needs.

Bottom Line
If you're running Qwen3-30B-A3B locally, treat it like two models in one. Flip the thinking mode based on the task, adjust the generation settings accordingly, and let it work how it was meant to.