# Sol-37 Site Audit

Date: 2026-04-17
Repo: `/home/david/random`
Site root: `/home/david/random/www`
Primary host: `sol.system42.one`
Local origin: `http://127.0.0.1:8888`

## Scope

This is the committed audit summary for the Sol-37 site as inspected on 2026-04-16. It is intended to match the current in-tree operator documentation in `www/README.md` and the browser manual in `www/site-server.html`.

The key points remain unchanged:

- Sol-37 is mostly static files
- the visible shell is only one layer
- the operational site depends on a small local service stack

## Current Public Shape

Primary public surfaces:

- `/`
  main shell
- `/gui`
  direct shell deep-link entry
- `/chat`
  dedicated public Sol console
- `/dashboard`
  public AI dashboard app
- `/sitemap.html`
  archive browser
- `/site-metrics.html`
  traffic monitor
- `/programs/logbook.html`
  public logbook
- `/programs/media-player.html`
  media surface

Primary API routes:

- `/api/chat/*`
- `/api/dashboard/*`
- `/api/dashboard-access/*`
- `/api/knowledge/*`
- `/api/logbook/*`

The local site origin is still Caddy on `127.0.0.1:8888`.

## Frontend Architecture

### Shell

`www/index.html` is the primary shell and still acts as a single-file window manager:

- desktop icons
- Start menu
- taskbar/task buttons
- draggable windows
- iframe-backed content/program surfaces
- shareable `/gui?i=...` links
- built-in floating Sol assistant orb

The shell is coherent, but the tradeoff remains obvious: a single-file interaction model is easy to deploy and hard to modularize.

### Dedicated Chat

`www/chat/` is now a first-class client rather than a placeholder.

Important current behavior:

- same-origin `/api/chat`
- streamed text over SSE
- streamed or cached voice playback through `/api/chat/speak`
- orb presence card aligned with the desktop assistant
- client-side diagnostics for transport/latency/cache state

### Dashboard App

`www/dashboard/` is now a first-class surface published from the main desktop shell.

Current behavior:

- launched from a new desktop icon and Start menu entry in `www/index.html`
- same-origin `/api/dashboard/*` for camera, caption, and service data
- same-origin `/api/dashboard-access/*` for public enable/disable state
- local and LAN clients keep access regardless of the public toggle
- public clients can only use the live dashboard when the access daemon enables it
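
The access rule in the last two bullets can be sketched as a single predicate. This is a minimal illustration, not the access daemon's code: the function name is hypothetical, and it assumes "local and LAN" means loopback or private-range addresses and that the caller already has the real client address (behind Caddy that would normally come from a forwarded header).

```python
import ipaddress


def dashboard_allowed(client_ip: str, public_enabled: bool) -> bool:
    """Return True if this client may use the live dashboard."""
    addr = ipaddress.ip_address(client_ip)
    # Assumption: local/LAN clients are loopback or private-range addresses.
    if addr.is_loopback or addr.is_private:
        return True
    # Public clients depend entirely on the access toggle.
    return public_enabled
```

A public address is admitted only while the toggle is on; a LAN address is admitted either way.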

### Assistant Surfaces

There are two real assistant clients:

1. floating desktop assistant in `index.html`
2. dedicated `/chat` client in `www/chat/`

Both now share the same broad runtime model:

- persistent session id
- same-origin chat backend and routes
- streamed chat deltas where streaming is appropriate
- optional voice playback
- retrieval grounding

## Assistant And Voice Stack

### Chat Flow

Current request flow:

1. Browser sends `POST /api/chat`
2. Caddy proxies to `sol_chat_api.py`
3. `sol_chat_api.py`:
   - updates session history
   - fetches retrieval context from the knowledge API
   - builds grounded prompt state
   - queries the local model backend on `127.0.0.1:18080`
4. The transport shape depends on client and context:
   - `/chat` uses SSE events:
     - `status`
     - `delta`
     - `done`
     - `error`
   - the floating desktop assistant now prefers one-shot JSON for page-grounded turns and keeps SSE for non-page/freeform turns
5. The desktop assistant retries without stream if a streamed turn faults or produces no usable text
6. Browser updates the visible transcript either incrementally or from the final JSON payload
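
Steps 4 to 6 can be sketched as a small SSE reader plus a fallback check. This is an illustration under assumptions: the event names come from the list above, but the payload format is not documented here, so the sketch treats each `delta` payload as plain text. The function names are hypothetical, and the real clients are browser JavaScript.

```python
from typing import Iterable, Iterator, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event, data) pairs from an iterable of SSE text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates one event.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format drops a single leading space after the colon.
            data.append(value[1:] if value.startswith(" ") else value)


def collect_reply(events: Iterable[Tuple[str, str]]) -> Tuple[str, bool]:
    """Return (text, needs_fallback) for one streamed turn."""
    parts = []
    for event, payload in events:
        if event == "delta":
            parts.append(payload)  # assumption: payload is plain text
        elif event == "error":
            return "".join(parts), True
        elif event == "done":
            break
    text = "".join(parts)
    # An empty or whitespace-only reply triggers the non-stream retry.
    return text, not text.strip()
```

When `needs_fallback` is true, the caller would repeat the request without streaming and render the final JSON payload instead.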

### Desktop Assistant Context Behavior

The desktop assistant supports two separate context strategies:

- compact page context for prompt grounding
- full extracted page text for read-aloud behavior

This separation is important and is now explicit in the codebase: when narration used the same bounded context as prompt grounding, read-aloud output was incomplete.

The compact prompt context is now deliberately smaller than before, and the desktop assistant fingerprints page state from the target, title, content type, headings, and bounded content. That fingerprint is used for client-side caching of:

- grounded page replies
- generated read-aloud narration text

Repeated page questions now reuse the cached reply until the page content changes. Repeated read-aloud requests reuse the cached narration text until the page content changes.
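
The fingerprint-keyed reuse described above might look like the following sketch. The hash function, the content bound of 4000 characters, the question normalization, and all names are assumptions made for illustration. The real cache lives in the browser client.

```python
import hashlib
import json


def page_fingerprint(target, title, content_type, headings, content, max_chars=4000):
    """Hash the page fields named in the audit into one stable key."""
    payload = json.dumps(
        [target, title, content_type, list(headings), content[:max_chars]],
        ensure_ascii=False,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


reply_cache = {}


def cached_reply(fingerprint, question, compute):
    """Reuse a grounded reply until the page fingerprint changes."""
    key = (fingerprint, question.strip().lower())
    if key not in reply_cache:
        reply_cache[key] = compute()  # only runs on a cache miss
    return reply_cache[key]
```

Because the fingerprint is part of the key, any change to the bounded content produces a new key and the old entry is never returned for the changed page.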

### Voice Flow

Current low-latency voice flow:

1. Browser identifies a speakable chunk
2. Browser points its `<audio>` element at `GET /api/chat/speak?text=...`
3. API either:
   - streams cached MP3 bytes from disk, or
   - streams fresh ElevenLabs output via `11speak.py`
4. Browser starts playback before the whole audio file is finished

This is the main improvement over the older blob-based path: playback no longer waits for the complete audio file.
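
Steps 1 and 2 of the voice flow can be sketched as a sentence splitter plus a URL builder. The splitting rule (whitespace after `.`, `!`, or `?`) and the function names are assumptions. The real client is browser JavaScript and may chunk differently.

```python
import re
from urllib.parse import urlencode

# Assumption: a sentence is complete once terminal punctuation is followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def pop_speakable(buffer: str):
    """Split completed sentences off a streaming buffer; return (chunks, remainder)."""
    parts = _SENTENCE_END.split(buffer)
    # The final part has not been terminated yet, so it stays in the buffer.
    return parts[:-1], parts[-1]


def speak_url(text: str) -> str:
    """Build the same-origin URL an <audio> element would be pointed at."""
    return "/api/chat/speak?" + urlencode({"text": text})
```

Each chunk returned by `pop_speakable` can be queued for playback while the remainder waits for more streamed text.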

### Current Voice Caching

Speech cache location:

- `~/.local/share/sol_chat_web/tts_cache`

Behavior:

- repeated identical text reuses cached MP3 by content hash
- uncached text still streams while being generated
- streamed chat replies can start speaking from completed sentence chunks before the full reply finishes
- when the browser reuses cached narration text, the server-side TTS cache also reuses the previously generated MP3

Practical result:

- earlier first audio
- less “dead air” after the model begins replying
- no need to wait for full MP3 materialization before browser playback
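
Content-hash caching of this kind can be sketched as below. Only the cache directory comes from this audit. The digest algorithm, the file naming, and the function names are assumptions, and the real key may also cover inputs such as voice settings.

```python
import hashlib
from pathlib import Path

CACHE_DIR = Path.home() / ".local/share/sol_chat_web/tts_cache"


def cache_path(text: str, cache_dir: Path = CACHE_DIR) -> Path:
    """Map text to a cache file named by its content hash (assumed: SHA-256)."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return cache_dir / f"{digest}.mp3"


def speech_source(text: str, cache_dir: Path = CACHE_DIR):
    """Decide whether to stream from disk or generate fresh audio."""
    path = cache_path(text, cache_dir)
    if path.is_file():
        return "cache", path
    # On a miss the server would stream new audio and write it to this path.
    return "generate", path
```

Identical text always maps to the same file, which is why reused narration text on the browser side also hits the server-side cache.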

### Desktop Playback Controls

The floating assistant now has a real transport model instead of a fire-and-forget orb:

- explicit `Play` / `Pause` / `Resume` button labeling based on current playback state
- hard stop/reset when the assistant is closed
- queued speech will not resume after close
- orb click now toggles or resumes playback instead of blindly restarting speech
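
The transport model above reads as a three-state machine. The sketch below is one plausible shape, with hypothetical state and function names. The real logic lives in the shell's JavaScript.

```python
from enum import Enum


class Playback(Enum):
    IDLE = "idle"
    PLAYING = "playing"
    PAUSED = "paused"


def button_label(state: Playback) -> str:
    """Label the transport button from the current playback state."""
    return {
        Playback.IDLE: "Play",
        Playback.PLAYING: "Pause",
        Playback.PAUSED: "Resume",
    }[state]


def on_orb_click(state: Playback) -> Playback:
    """Toggle or resume instead of restarting speech."""
    return Playback.PAUSED if state is Playback.PLAYING else Playback.PLAYING


def on_close(queue: list) -> Playback:
    """Hard stop: drop queued speech so nothing resumes after close."""
    queue.clear()
    return Playback.IDLE
```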

### Prompt Suggestions

The floating assistant still exposes three quick prompts, but they no longer behave like three static presets:

- the first suggestion stays anchored
- the second and third are rerolled from a broader prompt pool each time the popup opens
- page-grounded prompt pools are generated from the current page title, headings, and content
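
The anchored-plus-reroll behavior can be sketched as follows. The function name and the sample prompts are hypothetical, and how the pool itself is generated from the page is not shown.

```python
import random


def pick_suggestions(anchor: str, pool, rng=random):
    """Return the anchored prompt plus up to two distinct prompts from the pool."""
    # Deduplicate while keeping order, and never repeat the anchor.
    candidates = [p for p in dict.fromkeys(pool) if p != anchor]
    return [anchor] + rng.sample(candidates, k=min(2, len(candidates)))
```

Calling it each time the popup opens gives a stable first slot and fresh second and third slots, for example `pick_suggestions("Summarize this page", pool)`.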

## HTTP And Proxy Topology

Current Caddy behavior from `bin/Caddyfile.pkd_share`:

- `/api/logbook/*` -> `127.0.0.1:8890`
- `/api/knowledge/*` -> `127.0.0.1:8892`
- `/api/chat` and `/api/chat/*` -> `127.0.0.1:8895`
- `/api/dashboard/*` -> `127.0.0.1:8896`
- `/api/dashboard-access/*` -> `127.0.0.1:8897`
- `/chat` rewritten to `/chat/index.html`
- `/dashboard` rewritten to `/dashboard/index.html`
- `/chat` and `/chat/*` marked `Cache-Control: no-store`
- `/dashboard`, `/dashboard/*`, and `/api/dashboard-access/*` marked `Cache-Control: no-store`
- `/gui` -> `127.0.0.1:8893`
- markdown routes bridged through `127.0.0.1:8894`
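
For cross-checking the proxy rules, the API and `/gui` routes above can be mirrored as a lookup table. This is a sketch built only from the list in this audit, not from the Caddyfile itself: it ignores rewrites, headers, and the markdown bridge, and it assumes `/gui` and bare `/api/chat` are exact matches while `/*` entries are prefix matches.

```python
# (pattern, exact_match, upstream)
ROUTES = [
    ("/api/logbook/", False, "127.0.0.1:8890"),
    ("/api/knowledge/", False, "127.0.0.1:8892"),
    ("/api/chat", True, "127.0.0.1:8895"),
    ("/api/chat/", False, "127.0.0.1:8895"),
    ("/api/dashboard/", False, "127.0.0.1:8896"),
    ("/api/dashboard-access/", False, "127.0.0.1:8897"),
    ("/gui", True, "127.0.0.1:8893"),
]


def upstream_for(path: str):
    """Return the upstream for a request path, or None for static files."""
    path = path.split("?", 1)[0]  # matching ignores the query string
    for pattern, exact, upstream in ROUTES:
        if (path == pattern) if exact else path.startswith(pattern):
            return upstream
    return None
```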

Current caching semantics:

- chat and dashboard assets intentionally non-cacheable (`no-store`)
- audio assets long-lived/immutable
- images/video/static assets short-lived public cache
- streamed voice route explicitly `no-store`

## Backend Services

Observed relevant user services:

- `caddy-sol37.service`
- `sol-chat-api.service`
- `sol-chat-model.service`
- `local-dashboard-api.service`
- `dashboard-access-daemon.service`
- `knowledge-query-api.service`
- `sol37-gui-share.service`
- `site-index-watch.service`
- `video-playlist-watch.service`
- `audio-playlist-watch.service`
- `public-logbook-api.service`
- `public-logbook-irc-logger.service`

Observed relevant timers:

- `site-metrics-snapshot.timer`
- `synthetic-logbook-snapshot.timer`

Observed system service dependency:

- `ngircd.service`

Boot persistence:

- `Linger=yes` for user `david`

That remains operationally critical. Without linger, the user services run only while `david` has an active login session, so the site stack is only a logged-in desktop illusion.

## Generated State

Generated artifacts with operational significance:

- `www/site-index.json`
- `www/knowledge-index.json`
- `www/site-metrics.json`
- `www/audio/playlist.json`
- `www/video/playlist.json`
- `www/assets/share-previews/*`

These are not incidental build outputs. They materially affect browser-visible behavior.

## Current Strengths

- coherent identity between shell, `/chat`, and assistant surfaces
- same-origin route design simplifies browser behavior and deployment
- retrieval, chat, voice, metrics, and share metadata are now documented as one system instead of isolated features
- streamed text and streamed voice materially reduce perceived latency
- page-local caching now reduces repeated summary and narration latency inside the desktop shell

## Current Risks

- `index.html` is still a large maintenance hotspot
- runtime truth is split across repo content, generated state, and local daemons
- media playback remains the least predictable subsystem under public network conditions
- chat freshness still depends on preserving `no-store` semantics
- assistant features are easy to improve locally and easy to leave undocumented unless the docs are maintained deliberately
- desktop assistant behavior now depends on both browser-side caches and server-side TTS cache, so regressions may hide until a cold-cache test is run

## Verification Commands

Current useful checks:

```bash
git -C /home/david/random status --short
curl -I -s http://127.0.0.1:8888/
curl -s http://127.0.0.1:8895/api/chat/health
curl -s http://127.0.0.1:8896/api/dashboard/status
curl -s http://127.0.0.1:8897/api/dashboard-access/status
curl -sD - -o /tmp/test.mp3 'http://127.0.0.1:8895/api/chat/speak?text=hello&session=test'
file /tmp/test.mp3
curl -s http://127.0.0.1:8892/health | jq
curl -s 'http://127.0.0.1:8890/messages?channel=public-logbook&limit=3'
systemctl --user --type=service --state=running | rg -i 'caddy|sol-chat|dashboard|knowledge|logbook|site|video|audio|sol37'
systemctl --user list-timers --all | rg 'site-metrics|synthetic'
ss -ltnp | rg '(:8888|:8890|:8892|:8893|:8894|:8895|:8896|:8897|:18080)'
python3 /home/david/random/bin/check_sol_chat_asset_versioning.py
```
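
A scripted variant of the listener check could look like the sketch below. The port labels are taken from the topology described in this audit. The function names, the timeout, and the injectable `probe` parameter are assumptions made so the logic can be exercised without the live stack.

```python
import socket

# Local ports named in this audit, with the role each one plays.
EXPECTED_PORTS = {
    8888: "Caddy origin",
    8890: "logbook API",
    8892: "knowledge API",
    8893: "/gui upstream",
    8894: "markdown bridge",
    8895: "chat API",
    8896: "dashboard API",
    8897: "dashboard access",
    18080: "model backend",
}


def tcp_probe(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def missing_listeners(expected=EXPECTED_PORTS, probe=tcp_probe):
    """Return the expected ports that the probe reports as not listening."""
    return {port: name for port, name in expected.items() if not probe(port)}
```

An empty result from `missing_listeners()` means every expected local port accepted a connection.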

## Audit Conclusion

Sol-37 is still best understood as a hybrid archive system:

- static tree first
- live daemons where needed
- same-origin browser clients over local services
- generated artifacts treated as part of the product surface

The most important delta since the earlier audit is the maturity of the assistant/chat runtime. The assistant is no longer a passive shell flourish. It is now a genuine multi-surface client with:

- shared same-origin chat APIs
- page-aware grounding
- explicit read-aloud behavior
- mixed reply transport chosen by surface/context
- non-stream fallback for empty or failed desktop streams
- client-side page and narration reuse
- streamed or cached voice playback

That change needed documentation. This audit now treats it as first-order architecture rather than a small UI feature.