Vol. I · On the Depth of Thought

How hard
should it think?

Before GPT-5.4 answers, it can spend tokens reasoning. You choose how deep it goes: six settings, each with a price.

Reading · medium

Six settings, one spectrum

Cool → Hot

— 01

none

Straight to the answer. Zero thinking tokens.

off

fastest · —

— 02

minimal

A breath of hesitation. Formatting, trivial questions.

warm

quick · ~1×

— 03

low

Brief plan before answering. Summaries, small edits.

amber

light · ~2×

— 04

medium

Balanced default. Coding, analysis, most daily work.

glow

steady · ~4×

— 05

high

Works the problem. Multi-step logic, tricky debugging.

hot

slow · ~8×

— 06

xhigh

Maximum depth. Architecture, long chains, subtle bugs.

molten

deepest · ~16×
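The multipliers above can be read as a rough lookup table. The Python sketch below is only an illustration: it assumes the "~N×" figures are relative thinking cost with minimal as 1×, which the page does not state outright, and the names RELATIVE_COST and relative_cost are hypothetical, not part of Hermes.

```python
# Approximate multipliers copied from the six settings above.
# Assumption: "~N×" is relative thinking cost, with minimal as 1×.
RELATIVE_COST = {
    "none": 0,      # zero thinking tokens
    "minimal": 1,
    "low": 2,
    "medium": 4,
    "high": 8,
    "xhigh": 16,
}


def relative_cost(level: str, baseline: str = "medium") -> float:
    """Return how many times more thinking `level` spends than `baseline`."""
    for name in (level, baseline):
        if name not in RELATIVE_COST:
            raise ValueError(f"unknown reasoning level: {name!r}")
    if RELATIVE_COST[baseline] == 0:
        raise ValueError("baseline must be a level that spends thinking tokens")
    return RELATIVE_COST[level] / RELATIVE_COST[baseline]


print(relative_cost("xhigh"))                    # 4.0
print(relative_cost("low", baseline="minimal"))  # 2.0
```

Read this way, moving from medium to xhigh costs roughly four times the thinking, so it is worth reserving for the problems that need it.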

Four instruments on the console

Where you turn it up

Main dial

Hermes, itself.

agent.reasoning_effort

Sets the reasoning depth for every turn the main agent takes.

none minimal low medium high xhigh

Subagents

Delegated thinking.

delegation.reasoning_effort

A separate dial for the subagents Hermes spawns. It can run cheaper, or deeper, than the main agent.

inherit or override

Visibility

Show the work.

display.show_reasoning

Prints the reasoning trace alongside the answer. Free, since those tokens are already paid for.

false true

Mid-chat

Live override.

/reasoning high

Bump, lower, or hide reasoning inside a live session. No restart, no edit.

/reasoning high · /reasoning show · /reasoning none
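The three config keys above are written in dotted form, while the config file nests them by section. A minimal Python sketch of that mapping follows; the helper nest and the constant ALLOWED_LEVELS are hypothetical, and it assumes the subagent dial accepts the same six levels as the main one.

```python
# The six levels listed for agent.reasoning_effort.
ALLOWED_LEVELS = ("none", "minimal", "low", "medium", "high", "xhigh")


def nest(dotted: dict) -> dict:
    """Turn {"section.field": value} pairs into {"section": {"field": value}}."""
    nested = {}
    for key, value in dotted.items():
        section, _, field = key.partition(".")
        nested.setdefault(section, {})[field] = value
    return nested


settings = {
    "agent.reasoning_effort": "high",
    "delegation.reasoning_effort": "low",  # assumption: same levels as the main dial
    "display.show_reasoning": True,
}

# Reject any effort value outside the six known levels before nesting.
for key, value in settings.items():
    if key.endswith(".reasoning_effort") and value not in ALLOWED_LEVELS:
        raise ValueError(f"{key}: unknown level {value!r}")

print(nest(settings))
```

The nested result has the same shape as the config snippets in the prescriptions: one top-level section per dial, one field under each.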

III

What you give, what you get

The exchange

— every level is a trade —

fast → slow

cheap → expensive

shallow → deep

light quota → weekly cap

Three prescriptions

Paste & restart

Rx · A

Think harder, always.

# config.yaml
agent:
  reasoning_effort: high

Everyday default shifts up. Better on coding, analysis, planning.

Rx · B

Max depth, show work.

agent:
  reasoning_effort: xhigh
display:
  show_reasoning: true

Hardest problems. Reasoning trace visible. Burns quota faster.

Rx · C

Smart main, cheap subs.

agent:
  reasoning_effort: high
delegation:
  reasoning_effort: low

Main agent thinks deep. Delegated subtasks stay quick and lean.