Docs_archive
Mesh v1 Redis bus
Source docs/MESH_V1_REDIS_BUS.md · All_docs
Short document for the "transport" Phase at the Hive scale when everything runs on the same OS (a VM, a host, a container): no NATS required to bootstrap v1.
Role
- A2A (
POST /api/a2a/jsonrpc,/.well-known/agent-card.json): request/response conversation and persisted tasks (Redis), caller identity = rule documented inGET /api/a2a/statusandA2A_INTEGRATION.md. - Redis pub/sub (existing channels in
app/src/lib/redis.ts): best-effort, fire-and-forget to signal state changes (agents, system) to subscribers on the same Redis instance.
Current channels (code)
Strict contract on emission: payloads go through Zod schemas in app/src/lib/mesh-events.ts; on failure, no PUBLISH is executed and a mesh.redis.publish_invalid_payload log is emitted (subscribers never see off-contract JSON from these helpers).
Each published message includes meshMeta: { v: 1, producer: "hive-core", eventId: <uuid>, correlationId? }. If MESH_BUS_HMAC_SECRET (env, ≥32 characters) is set, a meshSig field (hex 64) authenticates the pair (channel + core + meshMeta) — see meshHmacHex / verifyAgentStatusHmac / verifySystemEventHmac in app/src/lib/mesh-envelope.ts. Agent routes propagate X-Request-Id / X-Correlation-Id / traceparent to meshMeta.correlationId when present.
Delivery promise: best-effort (no application-level ack; if Redis is down, the event is lost unless retry logic exists elsewhere).
Production: prefer TLS to Redis (rediss:// URL) and ACL-restricted users so the bus and rate-limit keys are not readable on the wire or from unrelated workloads.
If Redis is unreachable: publishAgentStatus / publishSystemEvent log mesh.redis.publish_failed (warn) and do not fail the API request — explicit degradation, no silent hang. The ioredis client uses maxRetriesPerRequest: 3 and lazyConnect: true (app/src/lib/redis.ts).
Timeouts & retries (v1) — reference
Logical schema (v1)
flowchart LR
subgraph app [Next / Node]
A2A[A2A JSON-RPC]
API[API REST]
end
R[(Redis)]
A2A -->|tasks / rate limit| R
API -->|pub/sub| R
For agent-to-agent messages with strong semantics (at-least-once, persistence until ack) across sites or over the Internet, see MESH_V2_GLOBAL.md (WAN transport + trust). The v1 remains best-effort on local pub/sub.
Correlation with A2A
- JSON-RPC requests carry a JSON-RPC
id: to be used as a correlation id in the logs (mesh.a2a.rpc.requestin the app). - Protocol-side identity follows
User.userName(Hive user UUID, email, orhive-internalfor the machine token) — aligned with audit and per-peer rate limiting.
Reasonable v1 limits (indicative)
- Order of magnitude: tens of agents, hundreds of events/second on a local Redis — beyond that, scale Redis and consider a dedicated bus.