# Thermal Guard Daemon

The thermal guard daemon monitors local temperatures and the top CPU consumers, logs snapshots and actions, and applies targeted mitigations when a configured rule trips.

## Files

- Runtime: `/home/david/random/bin/thermal_guard_daemon.py`
- User config: `/home/david/.config/thermal-guard/config.json`
- User unit: `/home/david/.config/systemd/user/thermal-guard.service`
- Latest snapshot: `~/.local/state/thermal-guard/latest.json`
- Event log: `~/.local/state/thermal-guard/events.jsonl`
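
To inspect recent activity without running the daemon, the event log can be read directly. The sketch below assumes only that `events.jsonl` holds one JSON object per line, as the extension suggests. The record schema is not documented here, so the helper (`tail_events`, a hypothetical name) returns raw records and makes no assumptions about their keys.

```python
import json
from pathlib import Path

# Path taken from the file list above; adjust if your state directory differs.
EVENT_LOG = Path.home() / ".local/state/thermal-guard/events.jsonl"


def tail_events(path, count=5):
    """Return the last `count` parseable JSON records from a JSON Lines file."""
    path = Path(path)
    if count <= 0 or not path.exists():
        return []
    records = []
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError:
            # Skip a partially written or corrupt line instead of failing.
            continue
    return records[-count:]


if __name__ == "__main__":
    for event in tail_events(EVENT_LOG):
        print(event)
```

Skipping unparseable lines keeps the reader usable if the daemon is in the middle of appending a record.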

## Commands

```bash
python3 /home/david/random/bin/thermal_guard_daemon.py once
python3 /home/david/random/bin/thermal_guard_daemon.py status
python3 /home/david/random/bin/thermal_guard_daemon.py daemon
python3 /home/david/random/bin/thermal_guard_daemon.py daemon --dry-run
```

## Rule Model

Each rule combines match criteria (command regex, temperature source, temperature threshold, minimum CPU percent) with trigger controls (required consecutive breaches, cooldown) and an action to run when the rule trips.

Supported actions:

- `signal`
- `signal_group`
- `systemctl_stop_user`
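
As a rough illustration of how the rule fields above might interact, here is a minimal sketch of rule evaluation. It is not the daemon's actual implementation: the `Rule` class, its field names, the `should_fire` helper, and the sensor name `cpu_package` are all hypothetical. The exact semantics (a clean sample resets the breach streak, and the cooldown is measured from the last firing) are assumptions.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    # Hypothetical field names; the real config keys may differ.
    command_regex: str
    temp_source: str
    temp_threshold_c: float
    min_cpu_percent: float
    consecutive_breaches: int
    cooldown_s: float
    action: str  # e.g. "signal", "signal_group", "systemctl_stop_user"
    breach_count: int = 0
    last_fired: Optional[float] = None


def should_fire(rule, command, cpu_percent, temps, now):
    """Return True if the rule trips on this sample; updates breach state."""
    breached = (
        re.search(rule.command_regex, command) is not None
        and cpu_percent >= rule.min_cpu_percent
        and temps.get(rule.temp_source, float("-inf")) >= rule.temp_threshold_c
    )
    if not breached:
        rule.breach_count = 0  # assumed: a clean sample resets the streak
        return False
    rule.breach_count += 1
    if rule.breach_count < rule.consecutive_breaches:
        return False
    if rule.last_fired is not None and now - rule.last_fired < rule.cooldown_s:
        return False  # still cooling down from the previous action
    rule.last_fired = now
    rule.breach_count = 0
    return True


# Example: three consecutive hot samples are required before the action fires.
rule = Rule(r"ollama runner", "cpu_package", 90.0, 50.0, 3, 300.0, "signal")
temps = {"cpu_package": 95.0}
results = [
    should_fire(rule, "ollama runner --ollama-engine", 80.0, temps, now=t)
    for t in (0, 10, 20, 30)
]
print(results)  # [False, False, True, False]
```

Requiring consecutive breaches filters out brief spikes, and the cooldown prevents the same action from being repeated on every sample while temperatures recover.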

## Default Rules

The default config targets the specific classes of runaway jobs that recently caused thermal spikes:

- `sol_ingest.py ... build`
- `ollama runner --ollama-engine`
- headless Playwright browser jobs
- `sol-chat-vision-fast.service` via its Gemma fast-vision backend

The defaults intentionally do not kill broader long-running services like `masterbot.service`; add those only if you explicitly want thermal guard to stop them.
