@@ -115,31 +115,86 @@ data:
       name: Voice Interaction
       category: core
       description: >
-        Speech-to-text on incoming voice notes and text-to-speech on
-        replies. Routed through PieCed's LiteLLM gateway so audio cost
-        is tracked per-tenant alongside chat.
-      # PHASE A: catalog entry only. No config_patch yet — toggling
-      # this package stores customer intent but does not change the
-      # OCI config. PHASE B (next iteration) wires in chatterbox-tts
-      # and a whisper adapter (or speaches-server) behind LiteLLM and
-      # adds the config_patch below, roughly:
+        Speech-to-text on incoming voice notes, automatic text-to-speech
+        on replies, and interactive Talk mode. Audio is routed through
+        PieCed's LiteLLM gateway so STT/TTS spend is tracked per-tenant
+        alongside chat usage. Inbound TTS uses kani-tts; Talk mode uses
+        kokoro-fastapi; STT uses self-hosted Whisper
+        (faster-whisper-large-v3). All three are private to the cluster.
+      # PHASE B — wired in. Toggling this package now installs the full
+      # voice surface into the tenant's OpenClawInstance config:
       #
-      # config_patch:
-      #   tools:
-      #     media:
-      #       audio:
-      #         enabled: true
-      #         models:
-      #           - provider: openai
-      #             model: pieced-whisper
-      #             apiBase: http://litellm.inference.svc:4000/v1
-      #   messages:
-      #     tts:
-      #       auto: inbound
-      #       provider: openai
-      #       openai:
-      #         model: pieced-tts
-      #         voice: nova
+      #   - messages.tts: auto TTS on inbound replies, routed to
+      #     `pieced-tts-inbound` (kani-tts behind LiteLLM).
+      #   - talk: interactive Talk mode, routed to `pieced-tts-talk`
+      #     (kokoro-fastapi behind LiteLLM). `interruptOnSpeech: true`
+      #     so the agent stops talking when the user starts talking.
+      #   - tools.media.audio: STT for inbound voice notes, capped at
+      #     20 MiB per message, routed to `pieced-stt` (whisper-openai
+      #     behind LiteLLM).
+      #
+      # Provider config notes
+      # ---------------------
+      # `models.providers.openai` is declared here with no chat models
+      # (`models: []`) — its only role is to give the STT block under
+      # `tools.media.audio` a place to resolve its apiKey/baseUrl from.
+      # The agent's primary chat model still lives under
+      # `models.providers.litellm` (set in builder.go base) and is
+      # unaffected by this patch — deep-merge adds `openai` as a
+      # sibling provider, not a replacement.
+      #
+      # `allowPrivateNetwork: true` on the openai provider is required
+      # because the LiteLLM endpoint is a `http://*.svc` private-
+      # network address. Without it OpenClaw refuses the outbound
+      # call as a private-network destination.
+      #
+      # `${LITELLM_API_KEY}` is supplied via the tenant's envFrom
+      # secretRef (see builder.go), populated by ESO from
+      # secret/data/tenants/<ns>/litellm.
+      #
+      # Network policy
+      # --------------
+      # Audio traffic shares the existing LiteLLM egress hole in the
+      # per-tenant CiliumNetworkPolicy (toEndpoints inference ns,
+      # port 4000). No additional CNP rule needed — voice routes
+      # through the same gateway as chat completions.
+      config_patch:
+        models:
+          providers:
+            openai:
+              apiKey: "${LITELLM_API_KEY}"
+              baseUrl: "http://litellm.inference.svc:4000/v1"
+              models: []
+              request:
+                allowPrivateNetwork: true
+        messages:
+          tts:
+            auto: "inbound"
+            provider: "openai"
+            providers:
+              openai:
+                apiKey: "${LITELLM_API_KEY}"
+                baseUrl: "http://litellm.inference.svc:4000/v1"
+                model: "pieced-tts-inbound"
+                voice: "alloy"
+        talk:
+          provider: "openai"
+          providers:
+            openai:
+              apiKey: "${LITELLM_API_KEY}"
+              baseUrl: "http://litellm.inference.svc:4000/v1"
+              model: "pieced-tts-talk"
+              voice: "af_bella"
+          interruptOnSpeech: true
+        tools:
+          media:
+            audio:
+              enabled: true
+              maxBytes: 20971520
+              models:
+                - provider: "openai"
+                  model: "pieced-stt"
+                  baseUrl: "http://litellm.inference.svc:4000/v1"
 
     # =====================================================================
     # CHANNELS — messaging integrations. Each ships a Channels map that
@@ -201,16 +256,14 @@ data:
 
     # =====================================================================
     # SKILLS — ClawHub skill installs. Operator passes each entry through
-    # to OpenClawInstance.spec.skills, where the OpenClaw operator's init
-    # container fetches it before the agent starts. Bare "<owner>/<slug>"
-    # resolves through ClawHub by default.
+    # to spec.skills on the OpenClawInstance.
     # =====================================================================
 
     git-cli:
       name: Git CLI
       category: skill
       description: >
-        Standalone git command-line operations (clone, commit, branch,
+        Use git from the assistant's shell (clone, commit, push, pull,
         diff, log, status). For private repositories, configure
         credentials in your workspace.
       skills:
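
The provider config notes in the first hunk say the patch is deep-merged, so `openai` lands next to the existing `litellm` provider instead of replacing it. The sketch below illustrates that behavior with a generic recursive dictionary merge. It is an assumption about how the merge behaves, not the operator's actual code, which this diff does not show. The `deep_merge` helper and the contents of `base` are hypothetical; only the keys under `patch` are taken from the diff.

```python
from copy import deepcopy


def deep_merge(base: dict, patch: dict) -> dict:
    """Return a new dict: nested dicts merge key by key, any other value is replaced by the patch."""
    merged = deepcopy(base)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical base config. The real one is set in builder.go and is not part of this diff.
base = {
    "models": {
        "providers": {
            "litellm": {"baseUrl": "http://litellm.inference.svc:4000/v1"},
        }
    }
}

# Subset of the config_patch added in the first hunk.
patch = {
    "models": {
        "providers": {
            "openai": {
                "apiKey": "${LITELLM_API_KEY}",
                "baseUrl": "http://litellm.inference.svc:4000/v1",
                "models": [],
                "request": {"allowPrivateNetwork": True},
            }
        }
    },
    # 20 MiB cap on inbound audio, written as arithmetic: 20 * 1024 * 1024 == 20971520.
    "tools": {"media": {"audio": {"enabled": True, "maxBytes": 20 * 1024 * 1024}}},
}

merged = deep_merge(base, patch)
print(sorted(merged["models"]["providers"]))  # ['litellm', 'openai']
```

Under this merge, `litellm` keeps its settings and `openai` is added as a sibling. A shallow `dict.update` on the top-level `models` key would instead drop `litellm`, which is the failure the note in the diff rules out.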