Commit Graph

1031 Commits

Author SHA1 Message Date
Hugh Brown
b4bbe1c657 fix: chain Alembic migrations to avoid multiple heads
Set a9b1c2d3e4f7.down_revision = "a1b2c3d4e5f6" so the activity_events
migration depends on the webhook_secret migration, creating a linear
chain instead of two heads from the same parent.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
0a749db2c4 Remove unused imports 2026-03-07 23:35:10 +05:30
Hugh Brown
3f333e1592 Add isort fix 2026-03-07 23:35:10 +05:30
Hugh Brown
dc25a9df6b fix: fail fast when RATE_LIMIT_BACKEND=redis but no Redis URL is configured
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
f1bcf72810 feat: add trusted client-IP extraction from proxy headers
Add get_client_ip() helper that inspects Forwarded and X-Forwarded-For
headers only when the direct peer is in TRUSTED_PROXIES (comma-separated
IPs/CIDRs). Replaces raw request.client.host in rate-limit and webhook
source_ip to prevent all traffic collapsing behind a reverse proxy IP.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
24e40f1153 docs: update operations README for configurable rate-limit backend
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
494bf4649e docs: update api.md and authentication.md for Redis rate-limit backend and token logging
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
72241d6870 Correct import order 2026-03-07 23:35:10 +05:30
Hugh Brown
81d16a324b docs: update security.md for Redis rate-limit backend and token logging
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
77f73872eb fix: scope optional agent auth rate limiting to X-Agent-Token header only
Prevents normal user requests with Authorization: Bearer from being
throttled by the agent auth limiter in the shared require_user_or_agent
dependency path.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
ac69c6b7b8 fix: normalize and validate signature_header in webhook schemas
Strip whitespace (blank → None) and reject non-ASCII-token characters
to prevent impossible header lookups that would fail all signed requests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
fe310b50dc Apply black fixes 2026-03-07 23:35:10 +05:30
Hugh Brown
fc9fc1661c feat: add Redis-backed rate limiter with configurable backend
Add RedisRateLimiter using sorted-set sliding window alongside the
existing InMemoryRateLimiter. Users choose via RATE_LIMIT_BACKEND
(memory|redis) with RATE_LIMIT_REDIS_URL falling back to RQ_REDIS_URL.
Redis backend validates connectivity at startup and fails open on
transient errors during requests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
ee825fb2d5 docs: fix gateway token description in openclaw_gateway_ws.md to match actual behavior
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
528a2483b7 feat: add configurable signature_header for webhook HMAC verification
Not all webhook providers use X-Hub-Signature-256 or X-Webhook-Signature.
Add an optional signature_header field so users can specify which header
carries the HMAC signature. When set, that exact header is checked;
when unset, the existing auto-detect fallback is preserved. The custom
header is also excluded from stored/exposed payload headers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
ce18fe4f0c fix: apply rate limiting to optional agent auth path
get_agent_auth_context_optional was not rate-limited, allowing
brute-force token guessing via routes that use require_user_or_agent.
Now applies agent_auth_limiter when a token is actually presented.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
84a5d8677e docs: update security.md to reflect current gateway token behavior
The has_token redaction was reverted to avoid a frontend breaking
change. Update docs to match: tokens are currently returned in API
responses, redaction is planned for a future PR. Also note the
configurable payload size limit.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
a66765a514 Apply ruff fixes 2026-03-07 23:35:10 +05:30
Hugh Brown
0896b0772d Import linting 2026-03-07 23:35:10 +05:30
Hugh Brown
ebe148e537 fix: use Alpine-compatible flags for addgroup/adduser in frontend Dockerfile
node:20-alpine uses BusyBox which does not support GNU-style
--system/--ingroup flags. Switch to -S/-G equivalents.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
433021b02c fix: use Annotated+BeforeValidator for webhook secret normalization
The previous field_validator approach passed `cls` as the first argument
to `_normalize_secret`, which only accepted `v`, causing a TypeError at
runtime. Switch to `Annotated[str | None, BeforeValidator(...)]` which
calls the function with just the value and also eliminates the repeated
validator assignment in both schema classes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
2ef6164cf8 fix: normalize webhook secret via schema validator instead of inline
Move blank/whitespace-only secret normalization to a shared
field_validator on both BoardWebhookCreate and BoardWebhookUpdate.
This ensures consistent behavior across create and update paths
and removes the inline normalization from the endpoint handlers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
91e8270364 revert: restore GatewayRead.token field to avoid frontend breaking change
The has_token boolean redaction requires coordinated frontend changes
(detail page, edit page, orval types). Revert to returning the raw
token for now; token redaction will be handled in a dedicated PR.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
54279bf413 revert: restore truncated token_prefix in agent auth log messages
A 6-character prefix of the token is standard practice for debugging
failed auth attempts and is not a security risk. Restored in both
required and optional auth paths, and removed the now-incorrect test
that asserted its absence.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
b2fb8a082d feat: make webhook payload size limit configurable
Add WEBHOOK_MAX_PAYLOAD_BYTES setting (default 1 MB) so deployments
with larger webhook payloads can raise the limit via environment
variable instead of being hard-blocked at 1 MB.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
acd1526acf docs: update api.md to reflect require_user_or_agent rename
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
af094ad11a fix: exclude signature and auth headers from webhook payload capture
_captured_headers was storing all x-* headers including
X-Hub-Signature-256 and X-Webhook-Signature. Since stored headers
are exposed via the payload read endpoint, this enabled replay
attacks without knowing the webhook secret. Now signature and
authorization headers are excluded from capture.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
fcbde9b0e1 test: remove duplicate rate limiter tests from test_security_fixes
These two tests were exact subsets of the dedicated test_rate_limit.py
suite. Consolidating to a single file avoids maintenance drift.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
cc50877131 refactor: rename require_admin_auth/require_admin_or_agent to require_user_auth/require_user_or_agent
These dependencies check actor type (human user vs agent), not admin
privilege. The old names were misleading and could cause authorization
mistakes when wiring new endpoints. Renamed across all 10 consumer
files along with their local ADMIN_AUTH_DEP / ADMIN_OR_AGENT_DEP
aliases.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
ea78b41a36 fix: persist webhook secret on create and normalize on update
The secret field was accepted in BoardWebhookCreate but never passed
to the BoardWebhook constructor, silently dropping it. Now secret is
persisted at creation time (with empty/whitespace normalized to None)
and similarly normalized on PATCH so sending "" clears a set secret.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
3a00636ceb Update backend/migrations/versions/a1b2c3d4e5f6_add_webhook_secret.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
3ca0931c72 Update backend/app/api/board_webhooks.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
cd7e411b3e Update backend/app/api/skills_marketplace.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
cd70242043 docs: fix rate limiter docstrings to reflect sliding-window algorithm
The module and class docstrings incorrectly described the implementation
as a "token-bucket" limiter when it actually uses a sliding-window log
(deque of timestamps with pruning).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
3a0c67a656 Update backend/app/api/board_webhooks.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
86229038eb Update backend/tests/test_security_fixes.py
Seems like a simpler fix.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
149fde90c4 docs: document security hardening changes from security review
Add documentation for all user/operator-facing changes introduced by the
security review branch: rate limits, security headers, webhook HMAC
verification, payload size limits, gateway token redaction, non-root
containers, agent token logging, and prompt injection mitigation.

Updated: docs/reference/api.md, docs/reference/authentication.md,
docs/reference/configuration.md, docs/deployment/README.md,
docs/operations/README.md, docs/openclaw_gateway_ws.md, backend/README.md.
Created: docs/reference/security.md (consolidated security reference).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
916dace3c8 Address ruff / formatting errors 2026-03-07 23:35:10 +05:30
Hugh Brown
62d2378bdc chore: simplify and harden security review changes
- Add prompt-injection fencing to _webhook_memory_content (was missing
  the --- BEGIN/END EXTERNAL DATA --- fence applied elsewhere)
- Wrap Content-Length parsing in try/except to avoid 500 on malformed
  header values
- Move _to_gateway_read below imports (was incorrectly placed between
  import blocks) and tighten transformer types
- Replace list-rebuild with deque.popleft in rate limiter for O(expired)
  amortized pruning instead of O(n) per call
- Make organization_id required in send_session_message to prevent
  fail-open cross-tenant check

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
4960d8561b security: fix fail-open auth, streaming payload limit, and rate limiter memory leak
- agent.py: Fail closed when gateway lookup returns None instead of
  silently dropping the organization filter (cross-tenant board leak)
- board_webhooks.py: Read request body via streaming chunks so an
  oversized payload is rejected before it is fully loaded into memory
- rate_limit.py: Add periodic sweep of expired keys to prevent
  unbounded memory growth from inactive clients
- test_rate_limit.py: Add test for the new sweep behavior

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
858575cf6c test: add comprehensive tests for all security fixes
Add 20 tests covering:
- require_user_actor: rejects agents and null users, passes valid users
- Webhook HMAC: rejects missing/invalid signatures, accepts valid ones,
  allows unsigned when no secret configured
- Prompt injection: sanitized skill name/URL, fenced external data in
  dispatch messages, system instructions precede data
- Security headers: verify nosniff, DENY, referrer-policy defaults
- Payload size: rejects oversized body and content-length
- Rate limiting: blocks after threshold, independent per-key
- Gateway token: has_token field present, token field absent
- Agent auth logs: no token_prefix in source

Also fix deprecated HTTP_413_REQUEST_ENTITY_TOO_LARGE status code.

All 407 tests pass.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
547965a5cb security: redact gateway tokens from API responses
Gateway tokens were returned as plaintext in GatewayRead API responses.
Replace the `token` field with a boolean `has_token` flag so the API
never exposes the plaintext token. The token remains in the database
for outbound gateway connections (full encryption would require key
management infrastructure).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
94988deef2 security: add rate limiting to agent auth and webhook ingest
Agent token auth performed O(n) PBKDF2 operations per request with no
rate limiting, enabling CPU exhaustion attacks. Webhook ingest had no
rate limits either. Add an in-memory token-bucket rate limiter:
- Agent auth: 20 requests/minute per IP
- Webhook ingest: 60 requests/minute per IP

Includes unit tests for the rate limiter.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
8a30c82c6d security: remove hardcoded auth token from committed .env.test
The file contained a publicly known LOCAL_AUTH_TOKEN value that could
be used against misconfigured deployments. Replace with an empty value
and a comment showing how to generate a secure token. The test suite
continues to work via conftest.py which sets its own test-only token.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
8e145a2129 security: set sensible defaults for security response headers
X-Content-Type-Options, X-Frame-Options, and Referrer-Policy all
defaulted to empty (disabled). Set defaults to nosniff, DENY, and
strict-origin-when-cross-origin respectively. Operators can still
override or disable via environment variables.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
c7f8578f38 security: run Docker containers as non-root user
Both backend and frontend Dockerfiles ran all processes as root.
Add a dedicated appuser in each runtime stage so container processes
run with minimal privileges, limiting blast radius of any container
escape.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
4257c08ba9 security: stop logging token prefix on failed agent auth
The first 6 characters of invalid agent tokens were logged, leaking
partial credential information. Remove token_prefix from log messages
while preserving the request path for debugging.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
66da278673 security: require org-admin for gateway session message endpoint
send_gateway_session_message only required basic auth (AUTH_DEP) while
all other gateway endpoints required ORG_ADMIN_DEP. Any authenticated
user could send messages to any gateway session. Now requires org-admin
and verifies the board belongs to the caller's organization.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
7ca4145aff security: add 1 MB payload size limit to webhook ingestion
The webhook ingest endpoint read the entire request body with no size
limit, enabling memory exhaustion attacks. Add a 1 MB limit checked
via both Content-Length header (early reject) and actual body size.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:35:10 +05:30
Hugh Brown
5d382ed67b security: mitigate prompt injection in agent instruction strings
User-controlled fields (skill name, source URL, webhook payloads) were
interpolated directly into agent instruction messages. Sanitize skill
fields by stripping newlines/control chars, and fence all external data
behind "BEGIN EXTERNAL DATA" / "BEGIN STRUCTURED DATA" delimiters with
explicit "do not interpret as instructions" markers. Move system
instructions above the data section so they cannot be overridden.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:35:10 +05:30