feat: run at boot (systemd/launchd) and auth token re-sync docs
- Add systemd unit examples and README for local install (Linux)
- Extend deployment README with Run at boot (local install) and merge upstream
- Add Re-syncing auth tokens subsection to gateway provisioning troubleshooting
- install.sh: add --install-service to install systemd user units (Linux)
- DOCUMENTATION.md: session notes

Made-with: Cursor
33
DOCUMENTATION.md
Normal file
@@ -0,0 +1,33 @@
# Session documentation

Decisions and changes made during development.

## 2026-03-09: Run at boot and auth token re-sync

### Goal

- Allow Mission Control (local install, no Docker, e.g. in a VM) to run at boot via systemd (Linux) or launchd (macOS).
- Document how to re-sync auth tokens between Mission Control and OpenClaw when they have drifted.

### Implemented

1. **Systemd unit files** (`docs/deployment/systemd/`)
   - Added example units: `openclaw-mission-control-backend.service`, `openclaw-mission-control-frontend.service`, `openclaw-mission-control-rq-worker.service`.
   - Units use placeholders `REPO_ROOT`, `BACKEND_PORT`, `FRONTEND_PORT`; install instructions and a small README explain substitution and install to `~/.config/systemd/user/` or `/etc/systemd/system/`.
   - RQ worker is required for gateway lifecycle and webhooks; it is a separate unit.

2. **Deployment docs** (`docs/deployment/README.md`)
   - Replaced placeholder with a short deployment guide.
   - "Run at boot (local install)": Linux (systemd) with link to `systemd/README.md`; macOS (launchd) with example plist and `launchctl load`; Docker Compose note for `restart: unless-stopped`.

3. **Troubleshooting** (`docs/troubleshooting/gateway-agent-provisioning.md`)
   - New subsection "Re-syncing auth tokens when Mission Control and OpenClaw have drifted": when tokens drift, run template sync with `rotate_tokens=true` via API (curl) or CLI (`scripts/sync_gateway_templates.py --rotate-tokens`); after sync, wake/update gateway if needed.

4. **install.sh** (`install.sh`)
   - New optional flag `--install-service` (local mode only). On Linux it copies the three systemd unit files from `docs/deployment/systemd/`, substitutes `REPO_ROOT` and the ports, installs them to `$XDG_CONFIG_HOME/systemd/user` (or `~/.config/systemd/user`), and runs `systemctl --user daemon-reload` and `systemctl --user enable`. On non-Linux platforms it prints a note pointing to `docs/deployment/README.md` for launchd. The installer never prompts for this step; it runs only when the user passes `--install-service`.

### Rationale

- **No Docker in VM**: User runs local install in a VM and does not want Docker there; run-at-boot is provided by the OS (systemd/launchd).
- **Units as examples**: Units are in `docs/deployment/systemd/` so they can be versioned and copied; `install.sh` installs them only when `--install-service` is given, so the user's service configuration (systemd units, LaunchAgents) is never modified without explicit opt-in.
- **Auth re-sync**: Token drift is a common failure mode; documenting the API and CLI with `rotate_tokens=true` in the provisioning troubleshooting doc makes recovery easy to find.
@@ -50,6 +50,8 @@ Open:
- Frontend: `http://localhost:${FRONTEND_PORT:-3000}`
- Backend health: `http://localhost:${BACKEND_PORT:-8000}/healthz`

To have containers restart on failure and after host reboot, add `restart: unless-stopped` to the `db`, `redis`, `backend`, and `frontend` services in `compose.yml`, and ensure Docker is configured to start at boot.

### 3) Verify

```bash
@@ -112,3 +114,65 @@ Typical setup (outline):
- Ensure the frontend can reach the backend over the configured `NEXT_PUBLIC_API_URL`

This section is intentionally minimal until we standardize a recommended proxy (Caddy/Nginx/Traefik).

## Run at boot (local install)

If you installed Mission Control **without Docker** (e.g. using `install.sh` with "local" mode, or inside a VM where Docker is not used), the installer does not configure run-at-boot. You can start the stack after each reboot manually, or configure the OS to start it for you.

### Linux (systemd)

Use the example systemd units and instructions in [systemd/README.md](./systemd/README.md). In short:

1. Copy the unit files from `docs/deployment/systemd/` and replace `REPO_ROOT`, `BACKEND_PORT`, and `FRONTEND_PORT` with your paths and ports.
2. Install the units under `~/.config/systemd/user/` (user) or `/etc/systemd/system/` (system).
3. Enable and start the backend, frontend, and RQ worker services.

The RQ queue worker is required for gateway lifecycle (wake/check-in) and webhook delivery; run it as a separate unit.

### macOS (launchd)

Use LaunchAgents so the backend, frontend, and worker run under your user and restart on failure.

1. Create a plist for each process under `~/Library/LaunchAgents/`, e.g. `com.openclaw.mission-control.backend.plist`:

   ```xml
   <?xml version="1.0" encoding="UTF-8"?>
   <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
   <plist version="1.0">
   <dict>
     <key>Label</key>
     <string>com.openclaw.mission-control.backend</string>
     <key>ProgramArguments</key>
     <array>
       <string>/usr/bin/env</string>
       <string>uv</string>
       <string>run</string>
       <string>uvicorn</string>
       <string>app.main:app</string>
       <string>--host</string>
       <string>0.0.0.0</string>
       <string>--port</string>
       <string>8000</string>
     </array>
     <key>WorkingDirectory</key>
     <string>REPO_ROOT/backend</string>
     <key>EnvironmentVariables</key>
     <dict>
       <key>PATH</key>
       <string>/usr/local/bin:/opt/homebrew/bin:REPO_ROOT/backend/.venv/bin</string>
     </dict>
     <key>KeepAlive</key>
     <true/>
     <key>RunAtLoad</key>
     <true/>
   </dict>
   </plist>
   ```

   Replace `REPO_ROOT` with the actual repo path. Ensure `uv` is on `PATH`, for example by adding the directory that contains it to the `PATH` entry in the plist. launchd does not expand `~`, so write that directory as an absolute path such as `/Users/you/.local/bin`. Load with:

   ```bash
   launchctl load ~/Library/LaunchAgents/com.openclaw.mission-control.backend.plist
   ```

2. Add similar plists for the frontend (`npm run start -- --hostname 0.0.0.0 --port 3000` in `REPO_ROOT/frontend`) and for the RQ worker (`uv run python ../scripts/rq worker` with `WorkingDirectory=REPO_ROOT/backend` and `ProgramArguments` set to `uv`, `run`, `python`, `../scripts/rq`, `worker`).

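As a rough illustration of the RQ worker plist described in step 2, the file could be generated with a small shell helper like the one below. This is a sketch, not part of the repository: the function name `rq_worker_plist` and the label `com.openclaw.mission-control.rq-worker` are assumptions modeled on the backend example, and the repo path is assumed to contain no XML special characters.

```bash
# Sketch: print a LaunchAgent plist for the RQ worker to stdout.
# rq_worker_plist and the rq-worker label are hypothetical names.
rq_worker_plist() {
  repo_root="$1"   # absolute path to the repo checkout
  cat <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key>
  <string>com.openclaw.mission-control.rq-worker</string>
  <key>ProgramArguments</key>
  <array>
    <string>/usr/bin/env</string>
    <string>uv</string>
    <string>run</string>
    <string>python</string>
    <string>../scripts/rq</string>
    <string>worker</string>
  </array>
  <key>WorkingDirectory</key>
  <string>${repo_root}/backend</string>
  <key>EnvironmentVariables</key>
  <dict>
    <key>PATH</key>
    <string>/usr/local/bin:/opt/homebrew/bin:${repo_root}/backend/.venv/bin</string>
  </dict>
  <key>KeepAlive</key>
  <true/>
  <key>RunAtLoad</key>
  <true/>
</dict>
</plist>
EOF
}
```

For example, `rq_worker_plist /Users/you/openclaw-mission-control > ~/Library/LaunchAgents/com.openclaw.mission-control.rq-worker.plist` writes the file, which can then be loaded with the same `launchctl load` command used for the backend.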
58
docs/deployment/systemd/README.md
Normal file
@@ -0,0 +1,58 @@
# Systemd unit files (local install, run at boot)

Example systemd units for running Mission Control at boot when installed **without Docker** (e.g. local install in a VM).

## Prerequisites

- **Backend**: `uv`, Python 3.12+, and `backend/.env` configured (including `DATABASE_URL`, `RQ_REDIS_URL` if using the queue worker).
- **Frontend**: Node.js 22+ and `frontend/.env` (e.g. `NEXT_PUBLIC_API_URL`).
- **RQ worker**: Redis must be running and reachable; `backend/.env` must set `RQ_REDIS_URL` and `RQ_QUEUE_NAME` to match the backend API.

If you use Docker only for Postgres and/or Redis, start those first (e.g. `docker compose up -d db` and optionally Redis) or add `After=docker.service` and start the stack via a separate unit or script.

## Placeholders

Before installing, replace in each unit file:

- `REPO_ROOT`: absolute path to the Mission Control repo (e.g. `/home/user/openclaw-mission-control`).
- `BACKEND_PORT`: backend port (default `8000`).
- `FRONTEND_PORT`: frontend port (default `3000`).

Example (from repo root):

```bash
REPO_ROOT="$(pwd)"
for f in docs/deployment/systemd/openclaw-mission-control-*.service; do
  # Write each rendered unit to the current directory
  sed -e "s|REPO_ROOT|$REPO_ROOT|g" -e "s|BACKEND_PORT|8000|g" -e "s|FRONTEND_PORT|3000|g" "$f" \
    > "$(basename "$f")"
done
# Then copy the generated .service files to ~/.config/systemd/user/ or /etc/systemd/system/
```

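Before installing the rendered files, it may be worth confirming that no placeholder survived the substitution. The helper below is a minimal sketch of such a check; `check_unit_rendered` is a hypothetical name and not part of the repository.

```bash
# Sketch: fail if a rendered unit file still contains a placeholder token.
# check_unit_rendered is a hypothetical helper.
check_unit_rendered() {
  unit_file="$1"
  # Comment lines are skipped so explanatory text that mentions a
  # placeholder does not trigger a false positive.
  if grep -v '^[[:space:]]*#' "$unit_file" | grep -Eq 'REPO_ROOT|BACKEND_PORT|FRONTEND_PORT'; then
    echo "unsubstituted placeholder in $unit_file" >&2
    return 1
  fi
  return 0
}

# Example usage:
#   for f in openclaw-mission-control-*.service; do check_unit_rendered "$f" || exit 1; done
```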
## Install and enable

**User units** (recommended for single-user / VM):

```bash
cp openclaw-mission-control-backend.service openclaw-mission-control-frontend.service openclaw-mission-control-rq-worker.service ~/.config/systemd/user/
systemctl --user daemon-reload
systemctl --user enable openclaw-mission-control-backend openclaw-mission-control-frontend openclaw-mission-control-rq-worker
systemctl --user start openclaw-mission-control-backend openclaw-mission-control-frontend openclaw-mission-control-rq-worker
```

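One caveat with user units: by default the per-user systemd manager starts at the user's first login and stops after the last session ends, so enabled user units do not start at boot on their own. Enabling lingering with `loginctl enable-linger` keeps the user manager running from boot. The sketch below checks the current state first; `linger_state` and `ensure_linger` are hypothetical helper names.

```bash
# Sketch: make sure systemd user units start at boot, not only at login.
# linger_state and ensure_linger are hypothetical helpers; loginctl ships with systemd.
linger_state() {
  # Prints "yes" or "no" for the given user (loginctl reports "Linger=yes").
  loginctl show-user "$1" --property=Linger 2>/dev/null | sed 's/^Linger=//'
}

ensure_linger() {
  user="${1:-$(id -un)}"
  if [ "$(linger_state "$user")" = "yes" ]; then
    echo "lingering already enabled for $user"
  else
    # May ask for authentication, depending on the distribution's policy.
    loginctl enable-linger "$user"
  fi
}

# Example usage:
#   ensure_linger            # current user
#   ensure_linger someuser   # a specific account
```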
**System-wide** (e.g. under `/etc/systemd/system/`):

```bash
sudo cp openclaw-mission-control-*.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now openclaw-mission-control-backend openclaw-mission-control-frontend openclaw-mission-control-rq-worker
```

## Order

Start order is not strict between backend, frontend, and worker; all use `After=network-online.target`. Ensure Postgres (and Redis, if used) are running before or with the backend/worker (e.g. start Docker services first, or use system units for Postgres/Redis with the Mission Control units depending on them).

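If Postgres or Redis may come up slightly later than the Mission Control units, one option is to wait for them in a wrapper script or an `ExecStartPre=` step. The helper below is a minimal sketch of that idea; `retry_until` is a hypothetical name, and the example commands assume `pg_isready` and `redis-cli` are installed and that both services listen on localhost with default ports.

```bash
# Sketch: run a command until it succeeds or the attempt budget is used up.
# retry_until is a hypothetical helper; it always makes at least one attempt.
retry_until() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while :; do
    # Success: the dependency is ready.
    "$@" && return 0
    # Out of attempts: report failure to the caller.
    [ "$i" -ge "$attempts" ] && return 1
    i=$((i + 1))
    sleep "$delay"
  done
}

# Example usage (hosts and ports are assumptions):
#   retry_until 30 1 pg_isready -q -h localhost -p 5432
#   retry_until 30 1 redis-cli -h localhost ping
```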
## Logs

- `journalctl --user -u openclaw-mission-control-backend -f` (or `sudo journalctl -u openclaw-mission-control-backend -f` for system units)
- Same for `openclaw-mission-control-frontend` and `openclaw-mission-control-rq-worker`.
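Alongside the logs, a quick status overview of all three units can be useful. The following is a small sketch that assumes the user-unit install; `mc_status` is a hypothetical helper name, and for system-wide units the `--user` flag would be dropped.

```bash
# Sketch: print the state of each Mission Control unit.
# mc_status is a hypothetical helper.
mc_status() {
  for unit in openclaw-mission-control-backend openclaw-mission-control-frontend openclaw-mission-control-rq-worker; do
    # is-active prints a state such as "active", "inactive", or "failed".
    state="$(systemctl --user is-active "$unit" 2>/dev/null)"
    printf '%s: %s\n' "$unit" "${state:-unknown}"
  done
}

# Example usage:
#   mc_status
```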
@@ -0,0 +1,23 @@
# Mission Control backend (FastAPI): example systemd unit for local install.
# Copy to ~/.config/systemd/user/ or /etc/systemd/system/, then:
#   sed -e 's|REPO_ROOT|/path/to/openclaw-mission-control|g' -e 's|BACKEND_PORT|8000|g' -i openclaw-mission-control-backend.service
#   systemctl --user daemon-reload   # or sudo systemctl daemon-reload
#   systemctl --user enable --now openclaw-mission-control-backend   # or sudo systemctl enable --now ...
#
# Requires: uv in PATH (e.g. ~/.local/bin), backend/.env present.

[Unit]
Description=Mission Control backend (FastAPI)
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
WorkingDirectory=REPO_ROOT/backend
EnvironmentFile=-REPO_ROOT/backend/.env
ExecStart=uv run uvicorn app.main:app --host 0.0.0.0 --port BACKEND_PORT
Restart=on-failure
RestartSec=5

[Install]
WantedBy=default.target
@@ -0,0 +1,23 @@
# Mission Control frontend (Next.js): example systemd unit for local install.
# Copy to ~/.config/systemd/user/ or /etc/systemd/system/, then:
#   sed -e 's|REPO_ROOT|/path/to/openclaw-mission-control|g' -e 's|FRONTEND_PORT|3000|g' -i openclaw-mission-control-frontend.service
#   systemctl --user daemon-reload   # or sudo systemctl daemon-reload
#   systemctl --user enable --now openclaw-mission-control-frontend   # or sudo systemctl enable --now ...
#
# Requires: Node.js/npm in PATH (e.g. from nvm or system install), frontend/.env present.

[Unit]
Description=Mission Control frontend (Next.js)
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
WorkingDirectory=REPO_ROOT/frontend
EnvironmentFile=-REPO_ROOT/frontend/.env
ExecStart=npm run start -- --hostname 0.0.0.0 --port FRONTEND_PORT
Restart=on-failure
RestartSec=5

[Install]
WantedBy=default.target
@@ -0,0 +1,24 @@
# Mission Control RQ queue worker: example systemd unit for local install.
# Processes lifecycle and webhook queue tasks; required for gateway wake/check-in and webhooks.
# Copy to ~/.config/systemd/user/ or /etc/systemd/system/, then:
#   sed -e 's|REPO_ROOT|/path/to/openclaw-mission-control|g' -i openclaw-mission-control-rq-worker.service
#   systemctl --user daemon-reload   # or sudo systemctl daemon-reload
#   systemctl --user enable --now openclaw-mission-control-rq-worker   # or sudo systemctl enable --now ...
#
# Requires: uv in PATH, Redis reachable (RQ_REDIS_URL in backend/.env), backend/.env present.

[Unit]
Description=Mission Control RQ queue worker
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
WorkingDirectory=REPO_ROOT/backend
EnvironmentFile=-REPO_ROOT/backend/.env
ExecStart=uv run python ../scripts/rq worker
Restart=on-failure
RestartSec=5

[Install]
WantedBy=default.target
@@ -104,3 +104,29 @@ Actions:
- gateway logs around bootstrap
- worker logs around lifecycle events
- agent `last_provision_error`, `wake_attempts`, `last_seen_at`

## Re-syncing auth tokens when Mission Control and OpenClaw have drifted

Mission Control stores a hash of each agent’s token and provisions OpenClaw by writing templates (e.g. `TOOLS.md`) that include `AUTH_TOKEN`. If the token on the gateway and the backend hash drift (e.g. after a reinstall, token change, or manual edit), heartbeats can fail with 401 and the agent may appear offline.

To re-sync:

1. Ensure Mission Control is running (API and queue worker).
2. Run **template sync with token rotation** so the backend issues new agent tokens and rewrites `AUTH_TOKEN` into the gateway’s agent files.

**Via API (curl):**

```bash
curl -X POST "http://localhost:8000/api/v1/gateways/GATEWAY_ID/templates/sync?rotate_tokens=true" \
  -H "Authorization: Bearer YOUR_LOCAL_AUTH_TOKEN"
```

Replace `GATEWAY_ID` with the gateway’s ID (from the Gateways list or the gateway URL in the UI) and `YOUR_LOCAL_AUTH_TOKEN` with your local auth token.

**Via CLI (from repo root):**

```bash
cd backend && uv run python scripts/sync_gateway_templates.py --gateway-id GATEWAY_ID --rotate-tokens
```

After a successful sync, OpenClaw agents will have new `AUTH_TOKEN` values in their workspace files; the next heartbeat or bootstrap will use the new token. If the gateway was offline, trigger a wake/update from Mission Control so agents restart and pick up the new token.

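For repeated use, the curl call above could be wrapped in a small shell function. This is only a sketch: the function names and the `MC_BASE_URL` and `LOCAL_AUTH_TOKEN` environment variables are assumptions for illustration, while the endpoint path and the `rotate_tokens=true` parameter are taken from the curl example.

```bash
# Sketch: rotate agent tokens for one gateway via the template sync endpoint.
# sync_url, rotate_gateway_tokens, MC_BASE_URL and LOCAL_AUTH_TOKEN are hypothetical names.
sync_url() {
  # $1 = base URL (a single trailing slash is stripped), $2 = gateway ID
  printf '%s/api/v1/gateways/%s/templates/sync?rotate_tokens=true\n' "${1%/}" "$2"
}

rotate_gateway_tokens() {
  gateway_id="$1"
  base_url="${MC_BASE_URL:-http://localhost:8000}"
  # --fail makes curl exit non-zero on HTTP errors such as 401.
  curl --fail --silent --show-error -X POST "$(sync_url "$base_url" "$gateway_id")" \
    -H "Authorization: Bearer ${LOCAL_AUTH_TOKEN:?set LOCAL_AUTH_TOKEN first}"
}

# Example usage:
#   LOCAL_AUTH_TOKEN=... rotate_gateway_tokens GATEWAY_ID
```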
52
install.sh
@@ -30,6 +30,7 @@ FORCE_LOCAL_AUTH_TOKEN=""
FORCE_DB_MODE=""
FORCE_DATABASE_URL=""
FORCE_START_SERVICES=""
FORCE_INSTALL_SERVICE=""

if [[ -t 0 ]]; then
  INTERACTIVE=1
@@ -131,6 +132,7 @@ Options:
  --db-mode <docker|external>  Local mode only
  --database-url <url>         Required when --db-mode external
  --start-services <yes|no>    Local mode only
  --install-service            Local mode only: install systemd user units for run at boot (Linux)
  -h, --help

If an option is omitted, the script prompts in interactive mode and uses defaults in non-interactive mode.
@@ -220,6 +222,10 @@ parse_args() {
        FORCE_START_SERVICES="$2"
        shift 2
        ;;
      --install-service)
        FORCE_INSTALL_SERVICE="yes"
        shift
        ;;
      -h|--help)
        usage
        exit 0
@@ -733,6 +739,45 @@ start_local_services() {
  )
}

install_systemd_services() {
  local backend_port="$1"
  local frontend_port="$2"
  local systemd_user_dir
  systemd_user_dir="${XDG_CONFIG_HOME:-$HOME/.config}/systemd/user"
  local units_dir="$REPO_ROOT/docs/deployment/systemd"

  if [[ "$PLATFORM" != "linux" ]]; then
    info "Skipping systemd install (not Linux). For macOS run-at-boot see docs/deployment/README.md (launchd)."
    return 0
  fi
  if [[ ! -d "$units_dir" ]]; then
    warn "Systemd units dir not found: $units_dir"
    return 1
  fi
  for name in openclaw-mission-control-backend openclaw-mission-control-frontend openclaw-mission-control-rq-worker; do
    if [[ ! -f "$units_dir/$name.service" ]]; then
      warn "Unit file not found: $units_dir/$name.service"
      return 1
    fi
  done

  mkdir -p "$systemd_user_dir"
  for name in openclaw-mission-control-backend openclaw-mission-control-frontend openclaw-mission-control-rq-worker; do
    sed -e "s|REPO_ROOT|$REPO_ROOT|g" \
      -e "s|BACKEND_PORT|$backend_port|g" \
      -e "s|FRONTEND_PORT|$frontend_port|g" \
      "$units_dir/$name.service" > "$systemd_user_dir/$name.service"
    info "Installed $systemd_user_dir/$name.service"
  done
  if command_exists systemctl; then
    systemctl --user daemon-reload
    systemctl --user enable openclaw-mission-control-backend openclaw-mission-control-frontend openclaw-mission-control-rq-worker
    info "Systemd user units enabled. Start with: systemctl --user start openclaw-mission-control-backend openclaw-mission-control-frontend openclaw-mission-control-rq-worker"
  else
    warn "systemctl not found; units were copied but not enabled."
  fi
}

ensure_repo_layout() {
  [[ -f "$REPO_ROOT/Makefile" ]] || die "Missing Makefile in expected repository root: $REPO_ROOT"
  [[ -f "$REPO_ROOT/compose.yml" ]] || die "Missing compose.yml in expected repository root: $REPO_ROOT"
@@ -954,6 +999,10 @@ SUMMARY
    wait_for_http "http://127.0.0.1:$frontend_port" "Frontend" 120 || true
  fi

  if [[ -n "$FORCE_INSTALL_SERVICE" ]]; then
    install_systemd_services "$backend_port" "$frontend_port" || true
  fi

  cat <<SUMMARY

Bootstrap complete (Local mode).
@@ -973,6 +1022,9 @@ If services were started by this script, logs are under:
Stop local background services:
  kill "\$(cat $LOG_DIR/backend.pid)" "\$(cat $LOG_DIR/frontend.pid)"
SUMMARY
  if [[ -n "$FORCE_INSTALL_SERVICE" && "$PLATFORM" == "linux" ]]; then
    info "Run at boot: systemd user units were installed and enabled. Start with: systemctl --user start openclaw-mission-control-backend openclaw-mission-control-frontend openclaw-mission-control-rq-worker"
  fi
}

main "$@"