Hermes Agent · Field Guide gpt-5.4 · openai-codex
Vol. I · On the Depth of Thought

How hard should it think?

Before GPT-5.4 answers, it can spend tokens reasoning. You choose how deep it goes: six settings, each with a price.

reasoning_effort: none · minimal · low · medium · high · xhigh
Reading  ·  medium
I

Six settings, one spectrum

Cool → Hot
— 01
none · Straight to the answer. Zero thinking tokens.
off
fastest ·
— 02
minimal · A breath of hesitation. Formatting, trivial questions.
warm
quick · ~1×
— 03
low · Brief plan before answering. Summaries, small edits.
amber
light · ~2×
— 04
medium · Balanced default. Coding, analysis, most daily work.
glow
steady · ~4×
— 05
high · Works the problem. Multi-step logic, tricky debugging.
hot
slow · ~8×
— 06
xhigh · Maximum depth. Architecture, long chains, subtle bugs.
molten
deepest · ~16×
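The multipliers above roughly double at each step. As a back-of-the-envelope sketch, assuming those figures are relative reasoning costs with minimal as the 1× baseline (the guide lists no figure for none, so it is left out), the ratio between two settings can be worked out like this. The names `RELATIVE_COST` and `cost_ratio` are illustrative and not part of Hermes.

```python
# Approximate multipliers copied from the list above.
# Assumption: they express relative reasoning cost, with `minimal` as 1x.
# `none` is omitted because the guide gives no multiplier for it.
RELATIVE_COST = {
    "minimal": 1,
    "low": 2,
    "medium": 4,
    "high": 8,
    "xhigh": 16,
}


def cost_ratio(from_level: str, to_level: str) -> float:
    """Return how many times more costly `to_level` is than `from_level`."""
    return RELATIVE_COST[to_level] / RELATIVE_COST[from_level]


# Moving from the medium default to xhigh:
print(cost_ratio("medium", "xhigh"))  # 4.0
```

Read this way, every step up the spectrum doubles the price, so two steps up from the default costs about four times as much.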
II

Four instruments on the console

Where you turn it up
Main dial
Hermes itself.
agent.reasoning_effort
Sets the reasoning depth for every turn the main agent takes.
none minimal low medium high xhigh
Subagents
Delegated thinking.
delegation.reasoning_effort
A separate dial for the subagents Hermes spawns. Set it cheaper, or deeper, than the main agent.
inherit or override
Visibility
Show the work.
display.show_reasoning
Prints the reasoning trace alongside the answer. Free: those tokens are already paid for.
false true
Mid-chat
Live override.
/reasoning high
Bump, lower, or hide reasoning inside a live session. No restart, no config edit.
/reasoning high /reasoning show /reasoning none
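One way to picture the live override is as a small update to per-session state. The sketch below is a hypothetical model, not Hermes's actual command handler: it covers only the three commands shown above, and the function name and state keys are invented for illustration.

```python
# Hypothetical model of the /reasoning commands listed above.
# The state keys and this function are assumptions made for illustration.
LEVELS = {"none", "minimal", "low", "medium", "high", "xhigh"}


def apply_reasoning_command(state: dict, line: str) -> dict:
    """Return a new session state after a `/reasoning <arg>` line."""
    parts = line.split()
    if len(parts) != 2 or parts[0] != "/reasoning":
        raise ValueError(f"not a /reasoning command: {line!r}")
    arg = parts[1]
    new_state = dict(state)  # copy, so the caller's state is left untouched
    if arg in LEVELS:
        new_state["reasoning_effort"] = arg
    elif arg == "show":
        new_state["show_reasoning"] = True
    else:
        raise ValueError(f"unsupported argument: {arg!r}")
    return new_state


session = {"reasoning_effort": "medium", "show_reasoning": False}
session = apply_reasoning_command(session, "/reasoning high")
session = apply_reasoning_command(session, "/reasoning show")
print(session)
```

Because the change lives in session state rather than in config.yaml, it would plausibly last only for the current chat, which matches the "no restart" promise.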
III

What you give, what you get

The exchange
— every level is a trade —
fast → slow
cheap → expensive
shallow → deep
light quota → weekly cap
IV

Three prescriptions

Paste & restart
Rx · A
Think harder, always.
# config.yaml
agent:
  reasoning_effort: high
Everyday default shifts up. Better on coding, analysis, planning.
Rx · B
Max depth, show work.
agent:
  reasoning_effort: xhigh
display:
  show_reasoning: true
Hardest problems. Reasoning trace visible. Burns quota faster.
Rx · C
Smart main, cheap subs.
agent:
  reasoning_effort: high
delegation:
  reasoning_effort: low
Main agent thinks deep. Delegated subtasks stay quick and lean.
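Prescription C depends on the subagent dial taking precedence over the main one when both are set. A minimal sketch of that resolution follows. It assumes that a missing `delegation.reasoning_effort` (or the literal value `inherit`, which is a guess at the syntax) falls back to the main agent's setting, and that the main agent defaults to medium when nothing is configured. The function name is hypothetical.

```python
# Illustrative resolution of the effective subagent setting.
# Assumptions: an unset or "inherit" delegation value falls back to the
# main agent's level, and the main agent defaults to "medium".
VALID_LEVELS = ("none", "minimal", "low", "medium", "high", "xhigh")


def subagent_effort(config: dict) -> str:
    """Return the reasoning effort a spawned subagent would use."""
    main = config.get("agent", {}).get("reasoning_effort", "medium")
    sub = config.get("delegation", {}).get("reasoning_effort")
    chosen = main if sub in (None, "inherit") else sub
    if chosen not in VALID_LEVELS:
        raise ValueError(f"unknown reasoning_effort: {chosen!r}")
    return chosen


# Rx C as a parsed config: the main agent runs high, subagents run low.
rx_c = {
    "agent": {"reasoning_effort": "high"},
    "delegation": {"reasoning_effort": "low"},
}
print(subagent_effort(rx_c))  # low

# Rx A sets only the main dial, so subagents inherit it.
rx_a = {"agent": {"reasoning_effort": "high"}}
print(subagent_effort(rx_a))  # high
```

Under these assumptions, leaving the delegation key out means a higher main setting also makes subagents more expensive, which is the cost Rx C is designed to avoid.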