# Sol-37 Activity Analytics Report

Generated: 2026-06-10 14:22 CDT

Sources:

- `/home/david/random/www/site-metrics.json`, generated at `2026-06-10T19:21:26.290697+00:00`
- `/tmp/pkd_caddy_access.log`, parsed through `2026/06/10 19:22:29.732`
- Live endpoint smoke checks against `http://127.0.0.1:8888` and `https://sol.system42.one`
- Browser pass against `https://sol.system42.one/gui`

## Executive Summary

Sol-37 is currently functional and active. The public site, local Caddy origin, chat API, dashboard API, push activity API, logbook API, metrics files, knowledge index, sitemap, and GUI shell all responded successfully during verification.

Traffic is dominated by live UI polling and dashboard/runtime endpoints, not by ordinary static page browsing. The two largest request classes are `/api/chat/broadcast` and `/push/api/activity`; together they account for roughly 78% of the raw Caddy requests currently on disk. This is expected for an open GUI shell with repeated status polling.

Recent bot activity is significant. Googlebot is actively crawling GUI-wrapped pages, historical DaveDot assets, chat history routes, and even posting page-summary prompts to `/api/chat`. Several long-running Googlebot `/api/chat` POSTs reached about 125 seconds and ended with Caddy status `0`, which appears to be client disconnect or timeout behavior rather than a current service outage.

## Current Health Snapshot

- Local root page: `200 OK`, 338,893 bytes
- Public root page: `200 OK`, 338,893 bytes
- Browser GUI load: title `Sol-37 • Retro UI`, `document.readyState=complete`
- Browser console: 0 warnings, 0 errors
- Browser GUI network requests observed: chat history, broadcast status, push activity, posts index, site index, knowledge index, and metrics all returned `200`
- Camera snapshot: `200 OK`, PNG, 324x240, 44,044 bytes
- Camera stream: `200 OK`, MJPEG stream, 326,325 bytes received before the intentional 5s timeout

Running services observed:

- `caddy-sol37.service`
- `sol-chat-api.service`
- `sol-chat-model.service`
- `sol-chat-vision-fast.service`
- `local-dashboard-api.service`
- `public-logbook-api.service`
- `public-logbook-irc-logger.service`
- `site-index-watch.service`
- `sol-observer.service`
- `sol37-gui-share.service`
- `video-playlist-watch.service`

## Cumulative Site Counters

These counters come from `site-metrics.json` and include recovered historical state.

- Host total hits: 294,088
- Visible total hits: 247,424
- Suppressed hits: 46,664
- Unique IPs: 1,729
- External hits: 184,790
- External unique IPs: 1,724
- Hits today: 2,766
- Hits last 24h: 3,883

Current metrics window:

- Window start: `2026-06-09T01:40:18.382000-05:00`
- Window end: `2026-06-10T14:20:47.730000-05:00`
- Window visible hits: 5,604
- Window host hits: 5,622
- Window suppressed hits: 18

Suppressed paths:

- `/site-metrics.json`: 33,608
- `/api/logbook/messages`: 12,097
- `/posts/index.json`: 879
- `/site-metrics.html`: 83
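
The counters above can be cross-checked with simple arithmetic. The following is a minimal sketch that uses only the figures listed in this section; the variable names are illustrative and are not fields confirmed to exist in `site-metrics.json`.

```python
# Figures copied from the counters listed in this section.
host_total = 294_088
visible_total = 247_424
suppressed_total = 46_664

suppressed_by_path = {
    "/site-metrics.json": 33_608,
    "/api/logbook/messages": 12_097,
    "/posts/index.json": 879,
    "/site-metrics.html": 83,
}

# Visible plus suppressed hits should reproduce the host total.
totals_reconcile = visible_total + suppressed_total == host_total

# Compare the per-path suppressed counts with the reported suppressed total.
path_sum = sum(suppressed_by_path.values())
path_gap = path_sum - suppressed_total

print(f"totals reconcile: {totals_reconcile}")
print(f"per-path suppressed sum: {path_sum:,} (gap {path_gap:+,})")
```

Visible and suppressed hits add up to the host total exactly. The per-path suppressed counts sum to 46,667, three more than the reported 46,664. The source does not explain the gap; one possibility is that the per-path and total counters were sampled at slightly different moments.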

## Top Routes

Top visible routes (lifetime):

| Route | Hits |
| --- | ---: |
| `/api/chat/broadcast` | 78,674 |
| `/push/api/activity` | 70,982 |
| `/api/dashboard/status` | 19,601 |
| `/api/dashboard/services/console` | 11,617 |
| `/api/dashboard-access/status` | 11,095 |
| `/api/dashboard/comments` | 9,409 |
| `/api/dashboard/camera.mjpg` | 7,599 |
| `/api/chat/health` | 7,588 |
| `/api/knowledge/health` | 6,871 |
| `/api/state/redblue` | 2,786 |
| `/api/chat` | 2,202 |
| `/gui` | 1,606 |

Top external routes:

| Route | External Hits |
| --- | ---: |
| `/api/chat/broadcast` | 61,362 |
| `/push/api/activity` | 52,504 |
| `/api/dashboard/status` | 13,491 |
| `/api/dashboard/services/console` | 7,863 |
| `/api/dashboard-access/status` | 7,430 |
| `/api/dashboard/comments` | 6,388 |
| `/api/dashboard/camera.mjpg` | 5,559 |
| `/api/chat/health` | 5,471 |

Raw current Caddy log top routes:

| Route | Requests |
| --- | ---: |
| `/api/chat/broadcast` | 2,208 |
| `/push/api/activity` | 2,203 |
| `/api/chat` | 118 |
| `/api/chat/history` | 68 |
| `/gui` | 64 |
| `/robots.txt` | 18 |
| `/site-metrics.json` | 10 |
| `/api/knowledge/query` | 7 |
| `/api/state` | 6 |
| `/sol37/index.html` | 6 |
| `/programs/dosbox.html` | 6 |
| `/chat/index.html` | 6 |
| `/posts/timeline.html` | 6 |
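
The roughly 78% polling share quoted in the Executive Summary can be reproduced from the two largest rows of this table and the raw status distribution reported under Status And Reliability. A short sketch, with illustrative variable names:

```python
# Request counts for the two polling routes, from the raw Caddy table.
polling_requests = {
    "/api/chat/broadcast": 2_208,
    "/push/api/activity": 2_203,
}

# Raw Caddy status distribution; its sum is the total raw request count.
raw_status_counts = {200: 5_556, 0: 53, 304: 26, 404: 5, 308: 1}

total_raw = sum(raw_status_counts.values())
polling_total = sum(polling_requests.values())
polling_share = polling_total / total_raw

print(f"{polling_total:,} of {total_raw:,} raw requests = {polling_share:.1%}")
```

This gives 4,411 of 5,641 raw requests, or about 78.2%.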

## Traffic Sources

Source classification from `site-metrics.json`:

| Source Type | Hits |
| --- | ---: |
| external browser | 177,756 |
| self | 62,555 |
| other bot | 4,604 |
| meta crawler | 1,074 |
| facebook crawler | 858 |
| unknown | 498 |
| internal lan | 79 |

Raw user-agent classes in the current Caddy log:

| Class | Requests |
| --- | ---: |
| browser | 4,446 |
| googlebot | 1,152 |
| curl/self-check | 20 |
| python-urllib/3.10 | 13 |
| iOS NetworkingExtension | 5 |
| facebook | 4 |
| Meta WebIndexer | 1 |

Top IPs in the current Caddy log:

| IP | Requests | Notes |
| --- | ---: | --- |
| `2600:100c:a211:a0a0:89a0:a1cf:5cf0:c605` | 1,800 | browser-class external client |
| `10.0.1.89` | 1,568 | self/internal checks and polling |
| `2600:100c:a211:a0a0:4e15:24df:7015:68a1` | 877 | browser-class external client |
| `66.249.79.133` | 392 | Googlebot |
| `66.249.79.131` | 343 | Googlebot |
| `66.249.79.134` | 192 | Googlebot |
| `66.249.79.132` | 187 | Googlebot |
| `132.147.145.129` | 120 | external |

## Time Pattern

Daily hits from the metrics snapshot:

| Date | Hits |
| --- | ---: |
| 2026-06-10 | 2,766 |
| 2026-06-09 | 3,039 |
| 2026-06-08 | 1,689 |
| 2026-06-07 | 263 |
| 2026-06-06 | 4,515 |
| 2026-06-05 | 2,886 |
| 2026-06-04 | 2,948 |

The hourly pattern in the raw Caddy log is steady at around 120-145 requests/hour for much of the window, with clear spikes:

- `2026-06-10 07`: 430 requests
- `2026-06-10 08`: 571 requests
- `2026-06-10 16`: 165 requests
- `2026-06-10 17`: 183 requests
- `2026-06-10 18`: 156 requests

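The spike in the 07 and 08 buckets is the clearest burst in the current raw log. The lift across the 16 through 18 buckets coincides with recent crawler and GUI activity, including Googlebot interactions with GUI page contexts. These hours follow the Caddy log's timestamps, which are UTC rather than CDT: the log was parsed through 19:22:29 while this report was generated at 14:22 CDT. In local time the spike falls at 02:00-03:59 CDT and the lift at 11:00-13:59 CDT.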

## Status And Reliability

Status distribution from `site-metrics.json`:

| Status | Hits |
| --- | ---: |
| 200 | 242,633 |
| 206 | 1,156 |
| 502 | 1,070 |
| 304 | 992 |
| 404 | 747 |
| 0 | 746 |
| 308 | 22 |
| 400 | 16 |

Status distribution in the current raw Caddy log:

| Status | Requests |
| --- | ---: |
| 200 | 5,556 |
| 0 | 53 |
| 304 | 26 |
| 404 | 5 |
| 308 | 1 |

Current-window 5xx check:

- No 5xx responses were found in the last 300 raw Caddy access log entries.
- The current raw log parse found no `502` entries, even though recovered metrics include 1,070 historical `502` hits.

Current error-like paths in the raw log:

| Path | Status Pattern |
| --- | --- |
| `/api/chat` | 65 successful `200`, 53 status `0` |
| `/apple-touch-icon.png` | 404 |
| `/apple-touch-icon-precomposed.png` | 404 |
| `/davedot/assets/ppt/Dave` | 404 |
| `/davedot/assets/original/wwwhome/DaveDot%20Search%20Toolbar%20-%20Download.txt` | 404 |
| `/api/chat/delete` | 404 |

The `status=0` entries are concentrated on `/api/chat` and are all slow, about 125 seconds. The slowest entries are Googlebot POST requests. This points to crawler-triggered long-running chat requests where the client disconnects or the gateway times out, not broad application failure.
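
To keep `status=0` from being counted with genuine HTTP errors, the status codes can be grouped into outcome classes. The sketch below is illustrative only: `classify_outcome` and the class names are hypothetical, and it assumes, as the paragraph above does, that status `0` means no HTTP response status was recorded for the request. It is applied here to the raw status distribution from this section.

```python
from collections import Counter


def classify_outcome(status: int) -> str:
    """Map a logged status code to a coarse outcome class (hypothetical names)."""
    if status == 0:
        # Assumption: no status was written, e.g. client disconnect or timeout.
        return "aborted"
    if status >= 500:
        return "server_error"
    if status >= 400:
        return "client_error"
    return "ok"


# Raw Caddy status distribution reported above.
raw_status_counts = {200: 5_556, 0: 53, 304: 26, 404: 5, 308: 1}

outcome_counts = Counter()
for status, count in raw_status_counts.items():
    outcome_counts[classify_outcome(status)] += count

for outcome, count in outcome_counts.most_common():
    print(f"{outcome}: {count:,}")
```

With this grouping the 53 aborted requests stay visible as their own class instead of being folded into the error rate, and the server-error class is empty for the current raw log.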

## Bandwidth-Heavy Content

Top response-byte paths in the current raw Caddy log:

| Path | Bytes Served |
| --- | ---: |
| `/push/api/activity` | 16,425,823 |
| `/davedot/assets/ppt-previews/index.pdf` | 10,407,223 |
| `/davedot/assets/ppt/Dave%20Dot%20v.2.ppt` | 8,704,000 |
| `/davedot/assets/ppt/Dave%20Dot%20v.4.ppt` | 8,551,936 |
| `/davedot/assets/ppt/index.ppt` | 7,790,080 |
| `/gui` | 4,824,017 |
| `/davedot/assets/ppt/Dave%20Dot%20v.pps` | 4,390,912 |
| `/generated/dosbox/c-drive/PROGRAMS/SESSION/SESSION.EXE` | 2,234,880 |
| `/davedot/assets/ppt/Dave%20Dot%20v.3.ppt` | 1,995,264 |
| `/unified-clock/app/assets/index.js` | 1,197,589 |
| `/assets/share-previews/featured-silver-creek-html-d27db4d1.png` | 1,150,426 |
| `/` | 903,791 |

The top bandwidth item is not media; it is repeated JSON activity polling. Historical DaveDot PowerPoint and PDF assets are also a meaningful bandwidth component, largely because crawlers are actively discovering them.
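
A rough per-request size for the polling endpoint follows from the byte total above and the request count in the raw Caddy route table. This sketch assumes both figures cover the same log window, which the report implies but does not state outright.

```python
# Bytes served and request count for /push/api/activity in the raw Caddy log.
activity_bytes = 16_425_823
activity_requests = 2_203

avg_bytes_per_poll = activity_bytes / activity_requests

print(f"average response: {avg_bytes_per_poll:,.0f} bytes per request")
```

That works out to roughly 7.5 KB per poll, so the bandwidth total comes from request volume rather than from large individual responses.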

## Recent Activity Narrative

Recent activity shows Googlebot crawling GUI-wrapped pages and then following through to underlying assets. Examples from the metrics `recent_requests` and access log include:

- `/gui?i=davedot%2Fassets%2Foriginal%2Fwwwhome%2Fbanner.html`
- `/api/chat/history?session=...`
- `/davedot/assets/original/wwwhome/banner.html`
- `/davedot/assets/original/wwwhome/banner.html?ts=1780704000000`
- `/api/chat` POST requests asking for page summaries
- DaveDot PowerPoint and PDF preview assets

This means the recent "local models thinking" observation is not isolated: crawlers are causing the public GUI and assistant pathways to produce page-summary work. Some of that work succeeds quickly; some long-running chat posts hit the 125-second edge and end with status `0`.

## Operational Observations

- The site is currently up on both local and public routes.
- Polling endpoints dominate request count and should be treated separately from content popularity.
- Suppressed metrics are doing useful work: without suppression, `/site-metrics.json` and logbook polling would distort visible activity.
- Current public API functionality is healthy, but `/api/chat` is exposed to crawler-driven POST load through GUI flows.
- The historical `502` count exists in recovered metrics, but current raw log evidence points more strongly at crawler timeouts/status `0` than active backend failure.
- Browser-rendered GUI behavior is clean at the time of this report: no console warnings/errors and all observed front-page API calls returned `200`.

## Recommendations

1. Keep `/api/chat/broadcast` and `/push/api/activity` in a separate "polling/control-plane" analytics bucket so content analytics are not swamped by idle GUI refresh traffic.
2. Add or tune crawler handling for assistant POST flows. Googlebot is exercising `/api/chat` and can create long-running model work; consider serving crawler-safe static summaries or disabling assistant POST auto-runs for known crawler user agents.
3. Add apple touch icon files or a deliberate Caddy response for `/apple-touch-icon.png` and `/apple-touch-icon-precomposed.png` to remove harmless 404 noise.
4. Review the DaveDot legacy asset paths that Googlebot is crawling. They are working in most cases, but the `/davedot/assets/ppt/Dave` 404 suggests at least one encoded-space or truncated legacy URL path may still need a redirect.
5. Track `status=0` separately from HTTP errors. In this data, status `0` is more diagnostic of client disconnects/timeouts than server-side failure.
6. Consider a compact analytics endpoint or generated report that splits traffic into buckets: control-plane polling, assistant/model work, crawlers, static archive assets, media, and human GUI sessions.
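
The bucket split in recommendations 1 and 6 could be prototyped with a small classifier. The following is a sketch under stated assumptions, not an existing part of the metrics pipeline: `bucket_for`, the crawler token list, the path prefixes, and the media extensions are all hypothetical choices for illustration.

```python
# Hypothetical rule tables; tune these against real traffic before relying on them.
POLLING_PATHS = {"/api/chat/broadcast", "/push/api/activity"}
CRAWLER_TOKENS = ("googlebot", "bingbot", "facebookexternalhit")
MEDIA_SUFFIXES = (".mjpg", ".png", ".jpg", ".mp4")


def bucket_for(uri: str, user_agent: str) -> str:
    """Assign one request to an analytics bucket using simple ordered rules."""
    path = uri.split("?", 1)[0]  # ignore the query string
    agent = user_agent.lower()

    if any(token in agent for token in CRAWLER_TOKENS):
        return "crawlers"
    if path in POLLING_PATHS:
        return "control-plane polling"
    if path == "/api/chat" or path.startswith("/api/chat/"):
        return "assistant/model work"
    if path.startswith("/davedot/assets/"):
        return "static archive assets"
    if path.lower().endswith(MEDIA_SUFFIXES):
        return "media"
    return "human GUI sessions"


print(bucket_for("/push/api/activity?limit=5", "Mozilla/5.0"))
print(bucket_for("/api/chat", "Mozilla/5.0 (compatible; Googlebot/2.1)"))
```

Rule order matters in this sketch: the crawler check runs first so that crawler hits on `/api/chat` are attributed to crawlers, and the polling check runs before the `/api/chat` prefix rule so that `/api/chat/broadcast` is not counted as model work.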

## Follow-Up Implementation

Implemented after this report on 2026-06-10:

- Added `SOL_CHAT_CRAWLER_SAFE_MODE`, default enabled, in `/home/david/random/bin/sol_chat_api.py`.
- Added crawler user-agent detection for common search, social, and preview crawlers.
- For crawler user agents, `POST /api/chat` and `GET /api/chat/query` now return static page-context fallback text with `crawler_safe: true`, `persisted: false`, and `cache_match: "crawler_safe"` instead of running live model generation.
- For crawler user agents, `GET /api/chat/history` now returns an empty message/debug payload with `crawler_safe: true`, avoiding transcript exposure to crawlers.
- Verified through the public route `https://sol.system42.one/api/chat` using a Googlebot user agent. The response returned immediately with static page-context text and did not persist history.
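
The behavior described above can be outlined as follows. This is an illustrative sketch only and is not the code in `sol_chat_api.py`: the function names, the user-agent pattern, the accepted "off" values for the environment variable, and the `reply` key are assumptions. Only the `SOL_CHAT_CRAWLER_SAFE_MODE` name, its default-enabled behavior, and the `crawler_safe`, `persisted`, and `cache_match` fields come from this report.

```python
import os
import re

# Hypothetical pattern covering a few common search, social, and preview crawlers.
CRAWLER_UA_PATTERN = re.compile(
    r"googlebot|bingbot|facebookexternalhit|twitterbot|slackbot",
    re.IGNORECASE,
)


def crawler_safe_mode_enabled(env=None) -> bool:
    """Return True unless the flag is explicitly switched off (default enabled)."""
    env = os.environ if env is None else env
    value = env.get("SOL_CHAT_CRAWLER_SAFE_MODE", "1")
    # Assumption: these strings are treated as "off".
    return value.strip().lower() not in {"0", "false", "no", "off"}


def is_crawler(user_agent: str) -> bool:
    """Detect a crawler from its User-Agent header."""
    return bool(CRAWLER_UA_PATTERN.search(user_agent or ""))


def crawler_safe_reply(page_context: str) -> dict:
    """Build a static fallback payload instead of running model generation."""
    return {
        "reply": f"Static page context: {page_context}",  # hypothetical key
        "crawler_safe": True,
        "persisted": False,
        "cache_match": "crawler_safe",
    }


googlebot_ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
if crawler_safe_mode_enabled({}) and is_crawler(googlebot_ua):
    print(crawler_safe_reply("davedot/assets/original/wwwhome/banner.html"))
```

Matching on the user agent is easy to spoof, so a sketch like this only reduces accidental crawler load; it is not an access control.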

## Verification Commands Used

```bash
curl -sS -D - -o /tmp/sol37-root-local.html http://127.0.0.1:8888/
curl -sS -D - -o /tmp/sol37-root-public.html https://sol.system42.one/
curl -sS http://127.0.0.1:8888/api/chat/health
curl -sS https://sol.system42.one/api/chat/health
curl -sS http://127.0.0.1:8888/api/dashboard/status
curl -sS https://sol.system42.one/api/dashboard/status
# URLs with query strings are quoted so the shell does not interpret "?" or "&".
curl -sS "http://127.0.0.1:8888/push/api/activity?limit=5"
curl -sS "https://sol.system42.one/push/api/activity?limit=5"
curl -sS "http://127.0.0.1:8888/api/logbook/messages?channel=public-logbook&limit=3"
curl -sS "https://sol.system42.one/api/logbook/messages?channel=public-logbook&limit=3"
bash /home/david/.codex/skills/playwright/scripts/playwright_cli.sh open https://sol.system42.one/gui
bash /home/david/.codex/skills/playwright/scripts/playwright_cli.sh console warning
bash /home/david/.codex/skills/playwright/scripts/playwright_cli.sh requests
```
