feat: run at boot (systemd/launchd) and auth token re-sync docs

- Add systemd unit examples and README for local install (Linux)
- Extend deployment README with Run at boot (local install) and merge upstream
- Add Re-syncing auth tokens subsection to gateway provisioning troubleshooting
- install.sh: add --install-service to install systemd user units (Linux)
- DOCUMENTATION.md: session notes

Made-with: Cursor
Claude Thebot
2026-03-09 22:25:31 -07:00
parent f7932d962a
commit efee334843
8 changed files with 303 additions and 0 deletions

DOCUMENTATION.md Normal file

@@ -0,0 +1,33 @@
# Session documentation
Decisions and changes made during development.
## 2026-03-09: Run at boot and auth token re-sync
### Goal
- Allow Mission Control (local install, no Docker, e.g. in a VM) to run at boot via systemd (Linux) or launchd (macOS).
- Document how to re-sync auth tokens between Mission Control and OpenClaw when they have drifted.
### Implemented
1. **Systemd unit files** (`docs/deployment/systemd/`)
- Added example units: `openclaw-mission-control-backend.service`, `openclaw-mission-control-frontend.service`, `openclaw-mission-control-rq-worker.service`.
- Units use placeholders `REPO_ROOT`, `BACKEND_PORT`, `FRONTEND_PORT`; install instructions and a small README explain substitution and install to `~/.config/systemd/user/` or `/etc/systemd/system/`.
- RQ worker is required for gateway lifecycle and webhooks; it is a separate unit.
2. **Deployment docs** (`docs/deployment/README.md`)
- Replaced placeholder with a short deployment guide.
- "Run at boot (local install)": Linux (systemd) with link to `systemd/README.md`; macOS (launchd) with example plist and `launchctl load`; Docker Compose note for `restart: unless-stopped`.
3. **Troubleshooting** (`docs/troubleshooting/gateway-agent-provisioning.md`)
- New subsection "Re-syncing auth tokens when Mission Control and OpenClaw have drifted": when tokens drift, run template sync with `rotate_tokens=true` via API (curl) or CLI (`scripts/sync_gateway_templates.py --rotate-tokens`); after sync, wake/update gateway if needed.
4. **install.sh** (`install.sh`)
- New optional flag `--install-service` (local mode only): on Linux, copies the three systemd unit files from `docs/deployment/systemd/`, substitutes `REPO_ROOT`/ports, installs to `$XDG_CONFIG_HOME/systemd/user` (or `~/.config/systemd/user`), runs `systemctl --user daemon-reload` and `systemctl --user enable`. On non-Linux, prints a note pointing to `docs/deployment/README.md` for launchd. Not prompted by default; only when the user passes `--install-service`.
### Rationale
- **No Docker in VM**: User runs local install in a VM and does not want Docker there; run-at-boot is provided by the OS (systemd/launchd).
- **Units as examples**: Units are in `docs/deployment/systemd/` so they can be versioned and copied; install.sh installs them only when `--install-service` is given, so it never modifies systemd or LaunchAgents configuration without explicit opt-in.
- **Auth re-sync**: Token drift is a common failure mode; documenting the API and CLI with `rotate_tokens=true` in the provisioning troubleshooting doc makes recovery easy to find.


@@ -50,6 +50,8 @@ Open:
- Frontend: `http://localhost:${FRONTEND_PORT:-3000}`
- Backend health: `http://localhost:${BACKEND_PORT:-8000}/healthz`
To have containers restart on failure and after host reboot, add `restart: unless-stopped` to the `db`, `redis`, `backend`, and `frontend` services in `compose.yml`, and ensure Docker is configured to start at boot.
### 3) Verify
```bash
@@ -112,3 +114,65 @@ Typical setup (outline):
- Ensure the frontend can reach the backend over the configured `NEXT_PUBLIC_API_URL`
This section is intentionally minimal until we standardize a recommended proxy (Caddy/Nginx/Traefik).
## Run at boot (local install)
If you installed Mission Control **without Docker** (e.g. using `install.sh` with "local" mode, or inside a VM where Docker is not used), the installer does not configure run-at-boot. You can start the stack after each reboot manually, or configure the OS to start it for you.
### Linux (systemd)
Use the example systemd units and instructions in [systemd/README.md](./systemd/README.md). In short:
1. Copy the unit files from `docs/deployment/systemd/` and replace `REPO_ROOT`, `BACKEND_PORT`, and `FRONTEND_PORT` with your paths and ports.
2. Install the units under `~/.config/systemd/user/` (user) or `/etc/systemd/system/` (system).
3. Enable and start the backend, frontend, and RQ worker services.
The RQ queue worker is required for gateway lifecycle (wake/check-in) and webhook delivery; run it as a separate unit.
### macOS (launchd)
Use LaunchAgents so the backend, frontend, and worker run under your user and restart on failure.
1. Create a plist for each process under `~/Library/LaunchAgents/`, e.g. `com.openclaw.mission-control.backend.plist`:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>com.openclaw.mission-control.backend</string>
<key>ProgramArguments</key>
<array>
<string>/usr/bin/env</string>
<string>uv</string>
<string>run</string>
<string>uvicorn</string>
<string>app.main:app</string>
<string>--host</string>
<string>0.0.0.0</string>
<string>--port</string>
<string>8000</string>
</array>
<key>WorkingDirectory</key>
<string>REPO_ROOT/backend</string>
<key>EnvironmentVariables</key>
<dict>
<key>PATH</key>
<string>/usr/local/bin:/opt/homebrew/bin:REPO_ROOT/backend/.venv/bin</string>
</dict>
<key>KeepAlive</key>
<true/>
<key>RunAtLoad</key>
<true/>
</dict>
</plist>
```
Replace `REPO_ROOT` with the actual repo path. Ensure `uv` is on `PATH` (e.g. add `~/.local/bin` to the `PATH` in the plist). Load with:
```bash
launchctl load ~/Library/LaunchAgents/com.openclaw.mission-control.backend.plist
```
2. Add similar plists for the frontend (`npm run start -- --hostname 0.0.0.0 --port 3000` in `REPO_ROOT/frontend`) and for the RQ worker (`uv run python ../scripts/rq worker` with `WorkingDirectory=REPO_ROOT/backend` and `ProgramArguments` pointing at `uv`, `run`, `python`, `../scripts/rq`, `worker`).
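As a rough sketch of step 2 for the RQ worker, the shell function below prints a plist that mirrors the backend example. The function name `render_worker_plist` and the label `com.openclaw.mission-control.rq-worker` are assumptions chosen for illustration; the program arguments and working directory follow the description above.

```shell
# Sketch: print a LaunchAgent plist for the RQ worker to stdout.
# The label is a hypothetical name modeled on the backend example.
# The repo path is inserted verbatim (no XML escaping), so avoid
# paths containing characters such as & or <.
render_worker_plist() {
  repo_root="$1"
  cat <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key>
  <string>com.openclaw.mission-control.rq-worker</string>
  <key>ProgramArguments</key>
  <array>
    <string>/usr/bin/env</string>
    <string>uv</string>
    <string>run</string>
    <string>python</string>
    <string>../scripts/rq</string>
    <string>worker</string>
  </array>
  <key>WorkingDirectory</key>
  <string>${repo_root}/backend</string>
  <key>EnvironmentVariables</key>
  <dict>
    <key>PATH</key>
    <string>/usr/local/bin:/opt/homebrew/bin:${repo_root}/backend/.venv/bin</string>
  </dict>
  <key>KeepAlive</key>
  <true/>
  <key>RunAtLoad</key>
  <true/>
</dict>
</plist>
EOF
}

# Example (writes the file; load it afterwards with launchctl load):
# render_worker_plist "$HOME/openclaw-mission-control" \
#   > "$HOME/Library/LaunchAgents/com.openclaw.mission-control.rq-worker.plist"
```

Generating the file from a function keeps the repo path in one place; the same approach could be adapted for the frontend plist.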


@@ -0,0 +1,58 @@
# Systemd unit files (local install, run at boot)
Example systemd units for running Mission Control at boot when installed **without Docker** (e.g. local install in a VM).
## Prerequisites
- **Backend**: `uv`, Python 3.12+, and `backend/.env` configured (including `DATABASE_URL`, `RQ_REDIS_URL` if using the queue worker).
- **Frontend**: Node.js 22+ and `frontend/.env` (e.g. `NEXT_PUBLIC_API_URL`).
- **RQ worker**: Redis must be running and reachable; `backend/.env` must set `RQ_REDIS_URL` and `RQ_QUEUE_NAME` to match the backend API.
If you use Docker only for Postgres and/or Redis, start those first (e.g. `docker compose up -d db` and optionally Redis) or add `After=docker.service` and start the stack via a separate unit or script.
## Placeholders
Before installing, replace in each unit file:
- `REPO_ROOT` — absolute path to the Mission Control repo (e.g. `/home/user/openclaw-mission-control`).
- `BACKEND_PORT` — backend port (default `8000`).
- `FRONTEND_PORT` — frontend port (default `3000`).
Example (from repo root):
```bash
REPO_ROOT="$(pwd)"
for f in docs/deployment/systemd/openclaw-mission-control-*.service; do
  # Render each unit into the current directory with placeholders substituted.
  sed -e "s|REPO_ROOT|$REPO_ROOT|g" -e "s|BACKEND_PORT|8000|g" -e "s|FRONTEND_PORT|3000|g" "$f" \
    > "$(basename "$f")"
done
# Then copy the generated .service files to ~/.config/systemd/user/ or /etc/systemd/system/
```
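Before copying the generated files, it may be worth confirming that no placeholder survived the substitution. The helper below is a minimal sketch; `has_unreplaced_placeholders` is a hypothetical name, and the check assumes your real repo path does not itself contain one of the placeholder strings.

```shell
# Sketch: exit 0 if any of the given files still contains a placeholder,
# non-zero otherwise. Pass the rendered .service files as arguments.
has_unreplaced_placeholders() {
  grep -El 'REPO_ROOT|BACKEND_PORT|FRONTEND_PORT' "$@" >/dev/null 2>&1
}

# Example:
# if has_unreplaced_placeholders openclaw-mission-control-*.service; then
#   echo "Some unit files still contain placeholders" >&2
# fi
```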
## Install and enable
**User units** (recommended for single-user / VM):
```bash
cp openclaw-mission-control-backend.service openclaw-mission-control-frontend.service openclaw-mission-control-rq-worker.service ~/.config/systemd/user/
systemctl --user daemon-reload
systemctl --user enable openclaw-mission-control-backend openclaw-mission-control-frontend openclaw-mission-control-rq-worker
systemctl --user start openclaw-mission-control-backend openclaw-mission-control-frontend openclaw-mission-control-rq-worker
```
**System-wide** (e.g. under `/etc/systemd/system/`):
```bash
sudo cp openclaw-mission-control-*.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now openclaw-mission-control-backend openclaw-mission-control-frontend openclaw-mission-control-rq-worker
```
## Order
Start order is not strict between backend, frontend, and worker; all use `After=network-online.target`. Ensure Postgres (and Redis, if used) are running before or with the backend/worker (e.g. start Docker services first, or use system units for Postgres/Redis with the Mission Control units depending on them).
## Logs
- `journalctl --user -u openclaw-mission-control-backend -f` (or `sudo journalctl -u openclaw-mission-control-backend -f` for system units)
- Same for `openclaw-mission-control-frontend` and `openclaw-mission-control-rq-worker`.
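One caveat for user units: by default the systemd user manager starts when the user logs in, so user units come up at login rather than at boot unless lingering is enabled for that account. The sketch below checks the current setting and enables it if needed; the function names are illustrative, while `loginctl show-user` and `loginctl enable-linger` are standard systemd commands.

```shell
# Interpret the one-line output of `loginctl show-user <user> --property=Linger`.
linger_is_enabled() {
  [ "$1" = "Linger=yes" ]
}

# Enable lingering for the current user if it is not already on.
# Depending on the host's polkit policy this may prompt for authentication.
ensure_linger() {
  user="$(id -un)"
  if linger_is_enabled "$(loginctl show-user "$user" --property=Linger 2>/dev/null)"; then
    echo "Lingering already enabled for $user"
  else
    loginctl enable-linger "$user"
  fi
}

# Example: ensure_linger
```

System-wide units under `/etc/systemd/system/` are not affected by this and start at boot once enabled.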


@@ -0,0 +1,23 @@
# Mission Control backend (FastAPI) — example systemd unit for local install.
# Copy to ~/.config/systemd/user/ or /etc/systemd/system/, then:
# sed -e 's|REPO_ROOT|/path/to/openclaw-mission-control|g' -e 's|BACKEND_PORT|8000|g' -i openclaw-mission-control-backend.service
# systemctl --user daemon-reload # or sudo systemctl daemon-reload
# systemctl --user enable --now openclaw-mission-control-backend # or sudo systemctl enable --now ...
#
# Requires: uv in PATH (e.g. ~/.local/bin), backend/.env present.
[Unit]
Description=Mission Control backend (FastAPI)
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
WorkingDirectory=REPO_ROOT/backend
EnvironmentFile=-REPO_ROOT/backend/.env
ExecStart=uv run uvicorn app.main:app --host 0.0.0.0 --port BACKEND_PORT
Restart=on-failure
RestartSec=5
[Install]
WantedBy=default.target


@@ -0,0 +1,23 @@
# Mission Control frontend (Next.js) — example systemd unit for local install.
# Copy to ~/.config/systemd/user/ or /etc/systemd/system/, then:
# sed -e 's|REPO_ROOT|/path/to/openclaw-mission-control|g' -e 's|FRONTEND_PORT|3000|g' -i openclaw-mission-control-frontend.service
# systemctl --user daemon-reload # or sudo systemctl daemon-reload
# systemctl --user enable --now openclaw-mission-control-frontend # or sudo systemctl enable --now ...
#
# Requires: Node.js/npm in PATH (e.g. from nvm or system install), frontend/.env present.
[Unit]
Description=Mission Control frontend (Next.js)
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
WorkingDirectory=REPO_ROOT/frontend
EnvironmentFile=-REPO_ROOT/frontend/.env
ExecStart=npm run start -- --hostname 0.0.0.0 --port FRONTEND_PORT
Restart=on-failure
RestartSec=5
[Install]
WantedBy=default.target


@@ -0,0 +1,24 @@
# Mission Control RQ queue worker — example systemd unit for local install.
# Processes lifecycle and webhook queue tasks; required for gateway wake/check-in and webhooks.
# Copy to ~/.config/systemd/user/ or /etc/systemd/system/, then:
# sed -e 's|REPO_ROOT|/path/to/openclaw-mission-control|g' -i openclaw-mission-control-rq-worker.service
# systemctl --user daemon-reload # or sudo systemctl daemon-reload
# systemctl --user enable --now openclaw-mission-control-rq-worker # or sudo systemctl enable --now ...
#
# Requires: uv in PATH, Redis reachable (RQ_REDIS_URL in backend/.env), backend/.env present.
[Unit]
Description=Mission Control RQ queue worker
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
WorkingDirectory=REPO_ROOT/backend
EnvironmentFile=-REPO_ROOT/backend/.env
ExecStart=uv run python ../scripts/rq worker
Restart=on-failure
RestartSec=5
[Install]
WantedBy=default.target


@@ -104,3 +104,29 @@ Actions:
- gateway logs around bootstrap
- worker logs around lifecycle events
- agent `last_provision_error`, `wake_attempts`, `last_seen_at`
## Re-syncing auth tokens when Mission Control and OpenClaw have drifted
Mission Control stores a hash of each agent's token and provisions OpenClaw by writing templates (e.g. `TOOLS.md`) that include `AUTH_TOKEN`. If the token on the gateway no longer matches the hash stored by the backend (e.g. after a reinstall, token change, or manual edit), heartbeats can fail with 401 and the agent may appear offline.
To re-sync:
1. Ensure Mission Control is running (API and queue worker).
2. Run **template sync with token rotation** so the backend issues new agent tokens and rewrites `AUTH_TOKEN` into the gateway's agent files.
**Via API (curl):**
```bash
curl -X POST "http://localhost:8000/api/v1/gateways/GATEWAY_ID/templates/sync?rotate_tokens=true" \
-H "Authorization: Bearer YOUR_LOCAL_AUTH_TOKEN"
```
Replace `GATEWAY_ID` with the gateway's ID (from the Gateways list or the gateway URL in the UI) and `YOUR_LOCAL_AUTH_TOKEN` with your local auth token.
**Via CLI (from repo root):**
```bash
cd backend && uv run python scripts/sync_gateway_templates.py --gateway-id GATEWAY_ID --rotate-tokens
```
After a successful sync, OpenClaw agents will have new `AUTH_TOKEN` values in their workspace files; the next heartbeat or bootstrap will use the new token. If the gateway was offline, trigger a wake/update from Mission Control so agents restart and pick up the new token.
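If you re-sync often, the curl call can be wrapped in a small helper. The following is a sketch only: the function names and the `LOCAL_AUTH_TOKEN` environment variable are assumptions for illustration, and the endpoint is the one shown in the curl example in this section.

```shell
# Build the template-sync URL with token rotation enabled.
# A trailing slash on the base URL is tolerated.
sync_url() {
  base_url="${1%/}"
  gateway_id="$2"
  printf '%s/api/v1/gateways/%s/templates/sync?rotate_tokens=true\n' "$base_url" "$gateway_id"
}

# POST the sync request. The token is read from the environment
# (LOCAL_AUTH_TOKEN is an assumed variable name) so it does not
# appear in the command line or shell history.
resync_gateway_tokens() {
  gateway_id="$1"
  base_url="${2:-http://localhost:8000}"
  if [ -z "${LOCAL_AUTH_TOKEN:-}" ]; then
    echo "LOCAL_AUTH_TOKEN is not set" >&2
    return 1
  fi
  curl --fail --silent --show-error -X POST \
    "$(sync_url "$base_url" "$gateway_id")" \
    -H "Authorization: Bearer $LOCAL_AUTH_TOKEN"
}

# Example: LOCAL_AUTH_TOKEN=... resync_gateway_tokens GATEWAY_ID
```

With `--fail`, curl exits non-zero on HTTP errors such as 401, which makes the helper usable in scripts.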


@@ -30,6 +30,7 @@ FORCE_LOCAL_AUTH_TOKEN=""
FORCE_DB_MODE=""
FORCE_DATABASE_URL=""
FORCE_START_SERVICES=""
FORCE_INSTALL_SERVICE=""
if [[ -t 0 ]]; then
INTERACTIVE=1
@@ -131,6 +132,7 @@ Options:
--db-mode <docker|external> Local mode only
--database-url <url> Required when --db-mode external
--start-services <yes|no> Local mode only
--install-service Local mode only: install systemd user units for run at boot (Linux)
-h, --help
If an option is omitted, the script prompts in interactive mode and uses defaults in non-interactive mode.
@@ -220,6 +222,10 @@ parse_args() {
FORCE_START_SERVICES="$2"
shift 2
;;
--install-service)
FORCE_INSTALL_SERVICE="yes"
shift
;;
-h|--help)
usage
exit 0
@@ -733,6 +739,45 @@ start_local_services() {
)
}
install_systemd_services() {
  local backend_port="$1"
  local frontend_port="$2"
  local systemd_user_dir
  systemd_user_dir="${XDG_CONFIG_HOME:-$HOME/.config}/systemd/user"
  local units_dir="$REPO_ROOT/docs/deployment/systemd"
  if [[ "$PLATFORM" != "linux" ]]; then
    info "Skipping systemd install (not Linux). For macOS run-at-boot see docs/deployment/README.md (launchd)."
    return 0
  fi
  if [[ ! -d "$units_dir" ]]; then
    warn "Systemd units dir not found: $units_dir"
    return 1
  fi
  for name in openclaw-mission-control-backend openclaw-mission-control-frontend openclaw-mission-control-rq-worker; do
    if [[ ! -f "$units_dir/$name.service" ]]; then
      warn "Unit file not found: $units_dir/$name.service"
      return 1
    fi
  done
  mkdir -p "$systemd_user_dir"
  for name in openclaw-mission-control-backend openclaw-mission-control-frontend openclaw-mission-control-rq-worker; do
    sed -e "s|REPO_ROOT|$REPO_ROOT|g" \
      -e "s|BACKEND_PORT|$backend_port|g" \
      -e "s|FRONTEND_PORT|$frontend_port|g" \
      "$units_dir/$name.service" > "$systemd_user_dir/$name.service"
    info "Installed $systemd_user_dir/$name.service"
  done
  if command_exists systemctl; then
    systemctl --user daemon-reload
    systemctl --user enable openclaw-mission-control-backend openclaw-mission-control-frontend openclaw-mission-control-rq-worker
    info "Systemd user units enabled. Start with: systemctl --user start openclaw-mission-control-backend openclaw-mission-control-frontend openclaw-mission-control-rq-worker"
  else
    warn "systemctl not found; units were copied but not enabled."
  fi
}
ensure_repo_layout() {
[[ -f "$REPO_ROOT/Makefile" ]] || die "Missing Makefile in expected repository root: $REPO_ROOT"
[[ -f "$REPO_ROOT/compose.yml" ]] || die "Missing compose.yml in expected repository root: $REPO_ROOT"
@@ -954,6 +999,10 @@ SUMMARY
wait_for_http "http://127.0.0.1:$frontend_port" "Frontend" 120 || true
fi
if [[ -n "$FORCE_INSTALL_SERVICE" ]]; then
install_systemd_services "$backend_port" "$frontend_port" || true
fi
cat <<SUMMARY
Bootstrap complete (Local mode).
@@ -973,6 +1022,9 @@ If services were started by this script, logs are under:
Stop local background services:
kill "\$(cat $LOG_DIR/backend.pid)" "\$(cat $LOG_DIR/frontend.pid)"
SUMMARY
if [[ -n "$FORCE_INSTALL_SERVICE" && "$PLATFORM" == "linux" ]]; then
info "Run at boot: systemd user units were installed and enabled. Start with: systemctl --user start openclaw-mission-control-backend openclaw-mission-control-frontend openclaw-mission-control-rq-worker"
fi
}
main "$@"