Agent Performance Radar - Figma-ready (UI Specification)
Visão Geral
UI executiva para visualização e governança de performance de agentes, baseada em métricas do Agent Loop v1.
Frames (Telas)
Frame 01 — Agent Performance Radar (Overview)
Objetivo: "Como estão meus agentes agora?"
Layout (Desktop 1440x900):
1. Top Bar
- Title: "Agent Performance Radar"
- Org Selector: Dropdown com orgs disponíveis
- Date Range: Toggle buttons (7d / 30d / 90d) + Custom date picker
- Domain Filter: Multi-select (Sales / Pricing / Ops / Risk)
- Autonomy Filter: Multi-select (L0 / L1 / L2 / L3)
- Button: Export (CSV/JSON)
2. KPI Row (Cards)
6 cards horizontais:
- Total Runs: runs_total
- Success Rate: success_rate_pct %
- Avg Match Score: avg_match_score / 100
- Outcome Closure Rate: outcome_closure_rate_pct %
- Escalation Rate: escalation_rate_pct %
- Avg Latency: avg_latency_sec s
3. Radar + Leaderboard (Split 60/40)
Radar Chart (60%):
- Eixos:
- Effectiveness (delta_score normalizado)
- Reliability (success_rate)
- Governance (closure_rate)
- Precision (false_positive_rate invertida)
- Efficiency (latency invertida)
- Confidence (avg_confidence + match_score)
Leaderboard Table (40%):
Colunas:
- Agent (code + name)
- Domain (badge)
- Autonomy (badge L0-L3)
- Runs
- Success%
- AvgΔ (delta_score)
- Closure%
- Escalations%
- Trend (↑↓ ou ±X.X)
4. Breakdowns (Tabs)
- Tab A: "By Domain" (bar chart)
- Tab B: "By Policy" (heatmap matrix)
- Tab C: "By Entity" (table)
- Tab D: "By Channel" (se existir)
- Tab E: "By Autonomy Level" (line chart)
Frame 02 — Agent Detail (Deep Dive)
Objetivo: "Por que esse agente está bom/ruim?"
1. Header
- Agent name/code
- Domain badge
- Version
- Autonomy level badge
- Active toggle (kill-switch)
2. Health Strip
7 métricas em cards:
- Success%
- Closure%
- AvgΔ
- Avg match_score
- Avg confidence
- Escalation%
- Median latency
3. Time Series (4 gráficos)
- Runs/day (line)
- Success rate/day (line)
- Avg delta_score/day (line)
- Avg match_score/day (line)
4. Failure Analysis
- Top Failure Reasons: Table com
error_message+failure_reason+ count - Most Frequent Decision Codes: Table com decision_code + performance
5. Top Precedents Generated
Lista dos últimos precedents (mm_item) ligados ao agente:
- Item title
- Summary
- Created at
- Link para Decision Timeline
6. Actions
- Promote Autonomy: Button (L0→L1→L2→L3) se policy permitir
- Kill-switch: Toggle is_active
- Create Guardrail Policy: Shortcut
Frame 03 — Policy Impact Matrix
Objetivo: "Quais políticas esse agente toca e qual o impacto?"
UI:
- Matrix:
- Rows = policy_code
- Cols = outcome buckets (SUCCESS / FAILURE / OBSERVED / PENDING)
- Células:
- Count
- Avg delta_score
- Avg match_score
- Drill-down: Abre "Decision Timeline" filtrada
Frame 04 — Autonomy Gate Control
Objetivo: "Autonomia progressiva com segurança"
Tabela por agente:
- Current autonomy
- Suggested autonomy (computed)
- Requirements checklist (SLOs)
- Button "Apply promote" (se aprovado)
Rules Display (read-only):
- min_runs: 200
- min_success%: 90
- min_closure%: 95
- min_match_score: 90
- max_escalation%: 5
Componentes (Figma Components)
C.TopBar.Filters
- Org selector
- Date range toggle
- Domain filter
- Autonomy filter
- Export button
C.KPI.Card
- Label
- Value
- Unit
- Trend indicator (opcional)
C.RadarChart
- 6 eixos
- Score 0-100 por eixo
- Tooltip com detalhes
C.Table.Leaderboard
- Sortable columns
- Badges (Domain, Autonomy)
- Trend indicators
C.TabBar
- Tabs horizontais
- Active state
- Content area
C.LineChart
- X: date
- Y: metric value
- Multiple series (opcional)
C.BarChart
- X: category
- Y: value
- Grouped (opcional)
C.HeatmapMatrix
- Rows x Cols
- Color scale (0-100)
- Tooltip com detalhes
C.Badge.AutonomyLevel
- L0 (gray)
- L1 (blue)
- L2 (green)
- L3 (purple)
C.Badge.Domain
- SALES (orange)
- PRICING (green)
- OPS (blue)
- RISK (red)
C.Badge.Trend
- ↑ (green)
- ↓ (red)
- ±X.X (neutral)
C.Drawer.DecisionTimelineLink
- Opens Decision Timeline filtered
- Shows decision details
C.Modal.PromoteAutonomy
- Current level
- Target level
- Requirements checklist
- Confirm button
C.Toggle.Active
- On/Off state
- Kill-switch visual
APIs REST
GET /api/agents/performance-radar
Query:
- org_id (required)
- date_from (optional, YYYY-MM-DD)
- date_to (optional, YYYY-MM-DD)
- domain (optional)
- autonomy (optional)
- agent_code (optional)
Response:
{
"kpis": {
"runs_total": 1200,
"success_rate_pct": 91.4,
"avg_match_score": 92.5,
"outcome_closure_rate_pct": 96.2,
"escalation_rate_pct": 3.1,
"avg_latency_sec": 420.2
},
"radar_axes": [
{"axis": "Effectiveness", "score": 82.1},
{"axis": "Reliability", "score": 91.4},
{"axis": "Governance", "score": 96.2},
{"axis": "Efficiency", "score": 70.4},
{"axis": "Confidence", "score": 88.9},
{"axis": "Precision", "score": 85.0}
],
"leaderboard": [
{
"agent_code": "CSuite.Sales.Agent",
"agent_name": "Sales Agent",
"domain": "SALES",
"autonomy": "L1",
"runs": 320,
"success_pct": 93.2,
"avg_delta": 8.4,
"closure_pct": 97.0,
"escalation_pct": 2.1,
"trend": "+2.1"
}
]
}
GET /api/agents/{agent_code}/performance
Query:
- org_id (required)
- date_from (optional)
- date_to (optional)
Response: AgentPerfKpis
GET /api/agents/{agent_code}/policy-matrix
Query:
- org_id (required)
- date_from (optional)
- date_to (optional)
Response:
{
"matrix": [
{
"policy_code": "SALES_DISCOUNT_MAX",
"policy_name": "Desconto Máximo em Vendas",
"outcomes_count": 45,
"avg_match_score": 95.2,
"avg_delta_score": 7.8,
"success_count": 42,
"failure_count": 2,
"observed_count": 1,
"pending_count": 0,
"success_rate_pct": 93.3
}
]
}
GET /api/agents/autonomy/suggestions
Query:
- org_id (required)
- date_from (optional)
- date_to (optional)
Response:
[
{
"agent_code": "CSuite.Sales.Agent",
"agent_name": "Sales Agent",
"current_autonomy": "L1",
"suggested_autonomy": "L2",
"requirements_met": {
"min_runs": true,
"min_success": true,
"min_closure": true,
"min_match": true,
"max_escalation": true
},
"requirements": {
"min_runs": 200,
"min_success_pct": 90,
"min_closure_pct": 95,
"min_match_score": 90,
"max_escalation_pct": 5
},
"can_promote": true
}
]
POST /api/agents/{agent_code}/autonomy/promote
Body:
{
"org_id": 0,
"to_level": "L2",
"actor": "robsonrr",
"reason": "metrics passed"
}
POST /api/agents/{agent_code}/active
Body:
{
"org_id": 0,
"is_active": false,
"actor": "robsonrr",
"reason": "investigating drift"
}
GraphQL Schema
type AgentPerfKpis {
runs_total: Int!
success_rate_pct: Float!
avg_match_score: Float
outcome_closure_rate_pct: Float
escalation_rate_pct: Float!
avg_latency_sec: Float
}
type RadarAxisScore {
axis: String!
score: Float!
}
type AgentLeaderboardRow {
agent_code: String!
agent_name: String!
domain: String!
autonomy: String!
runs: Int!
success_pct: Float!
avg_delta: Float
closure_pct: Float
escalation_pct: Float!
trend: String
}
type AgentRadarOverview {
kpis: AgentPerfKpis!
radar_axes: [RadarAxisScore!]!
leaderboard: [AgentLeaderboardRow!]!
}
type Query {
agentPerformanceRadar(
org_id: Int!,
date_from: String!,
date_to: String!,
domain: String,
autonomy: String,
agent_code: String
): AgentRadarOverview!
agentPerformance(
org_id: Int!,
agent_code: String!,
date_from: String!,
date_to: String!
): AgentPerfKpis!
}
Autonomy Suggestion Rules
Thresholds Padrão
min_runs = 200(30d)success_rate_pct >= 90outcome_closure_rate_pct >= 95avg_match_score >= 90escalation_rate_pct <= 5avg_delta_scoredentro do target do domínio
Sugestão de Promoção
- L0→L1: Passou todos os thresholds
- L1→L2: + Estabilidade 2 semanas
- L2→L3: + Zero drift/alertas críticos 30d
Mapeamento de Dados
Views SQL
vw_agent_perf_rollup→ Overview KPIsvw_agent_policy_matrix→ Policy Impact Matrixvw_agent_perf_timeseries→ Time Series Chartsvw_agent_failure_analysis→ Failure Analysisvw_agent_precedents→ Precedents List
Procedures
sp_agent_perf_rollup→ Performance agregada (parametrizada)
Tabelas
agent_registry→ Agent metadataagent_runs→ Run executionspg_decision_log→ Decisionsexecution_action_ledger→ Actionsctx_policy_outcomes→ Outcomesmm_item→ Precedents
Próximos Passos
- Implementar UI React/Next.js baseada nos frames Figma
- Integrar com APIs REST criadas
- Adicionar GraphQL (opcional)
- Testar autonomia suggestions com dados reais
- Adicionar alertas quando métricas caem abaixo dos thresholds