Purple8 Database Schema Documentation
Overview
This document describes the PostgreSQL database schema for the Purple8 platform. It covers the observability tables (migration 001–002), document and memory tables (003–004), agent self-learning tables (005), and the sandbox session tables.
Table of Contents
- Agents Table
- Test Results Table
- Logs Table
- Traces Table
- Coverage Table
- Alerts Table
- Relationships
- Sample Queries
1. Agents Table
Purpose: Stores status and metrics for all agents (Core, QA, AI/ML, Enterprise) and infrastructure components.
Schema
| Column |
Type |
Nullable |
Default |
Description |
agent_id |
INTEGER |
NOT NULL |
AUTO |
Primary key (auto-increment) |
name |
VARCHAR(100) |
NOT NULL |
- |
Unique agent name (e.g., “API Gateway”, “Unit Testing”) |
category |
VARCHAR(50) |
NOT NULL |
- |
Agent category: core, qa, aiml, enterprise, infrastructure |
status |
VARCHAR(20) |
NOT NULL |
‘unknown’ |
Current status: healthy, warning, error, unknown |
uptime |
DOUBLE PRECISION |
NULL |
0 |
Uptime percentage (0-100) |
avg_response_time |
INTEGER |
NULL |
0 |
Average response time in milliseconds |
success_rate |
DOUBLE PRECISION |
NULL |
0 |
Success rate percentage (0-100) |
last_active |
TIMESTAMP |
NULL |
- |
Last activity timestamp |
metadata |
JSONB |
NULL |
- |
Additional agent-specific metadata |
created_at |
TIMESTAMP |
NULL |
NOW() |
Record creation timestamp |
updated_at |
TIMESTAMP |
NULL |
NOW() |
Last update timestamp |
Indexes
- Primary Key:
agent_id
- Unique:
name
- Indexes:
category, status, last_active DESC, (category, status)
Categories
- core (10 agents): Core development agents (API Gateway, Frontend, Code Generation, etc.)
- qa (9 agents): QA/Testing agents (Unit, Integration, Security, Performance, etc.)
- aiml (11 agents): AI/ML agents (Deep Learning, NLP, Computer Vision, etc.)
- enterprise (9 agents): Enterprise agents (Data Engineering, Compliance, Audit, etc.)
- infrastructure (4+ components): Infrastructure components (PostgreSQL, Redis, RabbitMQ, MinIO)
// Core agents
{"type": "service", "port": "8000"}
// QA agents
{"tests_run": 172, "pass_rate": 95.3, "coverage": 87.5}
// AI/ML agents
{"accuracy": 94.5, "inference_time": 250, "memory_mb": 512}
// Infrastructure
{"type": "database", "version": "15", "port": "5432"}
2. Test Results Table
Purpose: Stores detailed test execution results for all test types with comprehensive failure tracking.
Schema
| Column |
Type |
Nullable |
Default |
Description |
test_id |
INTEGER |
NOT NULL |
AUTO |
Primary key (auto-increment) |
test_type |
VARCHAR(50) |
NOT NULL |
- |
Type: unit, integration, security, performance, regression, uat, load, smoke, e2e |
status |
VARCHAR(20) |
NOT NULL |
- |
Status: completed, failed, running, cancelled |
timestamp |
TIMESTAMPTZ |
NULL |
CURRENT_TIMESTAMP |
Test execution timestamp |
total_tests |
INTEGER |
NOT NULL |
0 |
Total number of tests executed |
passed |
INTEGER |
NOT NULL |
0 |
Number of tests passed |
failed |
INTEGER |
NOT NULL |
0 |
Number of tests failed |
skipped |
INTEGER |
NOT NULL |
0 |
Number of tests skipped |
warnings |
INTEGER |
NOT NULL |
0 |
Number of warnings generated |
execution_time |
DOUBLE PRECISION |
NOT NULL |
0.0 |
Total execution time in seconds |
failed_tests |
JSONB |
NULL |
’[]’ |
Array of failed test details |
warning_details |
JSONB |
NULL |
’[]’ |
Array of warning details |
test_output |
TEXT |
NULL |
- |
Raw test output/logs |
triggered_by |
VARCHAR(100) |
NULL |
‘manual’ |
Who/what triggered the test |
environment |
VARCHAR(50) |
NULL |
‘local’ |
Environment: local, docker, ci, staging, prod |
branch_name |
VARCHAR(100) |
NULL |
- |
Git branch name |
commit_hash |
VARCHAR(40) |
NULL |
- |
Git commit hash |
Indexes
- Primary Key:
test_id
- Indexes:
timestamp DESC, test_type, status, (test_type, timestamp DESC)
Constraints
status must be: completed, failed, running, cancelled
test_type must be: unit, integration, security, performance, regression, uat, load, smoke, e2e
total_tests >= (passed + failed + skipped)
Failed Tests JSON Structure
[
{
"name": "test_api_authentication",
"error": "AssertionError: Expected 200, got 401",
"file": "tests/test_api.py",
"line": 45,
"stack_trace": "Traceback (most recent call last):\n File..."
}
]
Warning Details JSON Structure
[
{
"message": "DeprecationWarning: Function X is deprecated",
"file": "tests/test_utils.py",
"line": 23
}
]
3. Logs Table
Purpose: Centralized application logs from all services with structured metadata.
Schema
| Column |
Type |
Nullable |
Default |
Description |
log_id |
INTEGER |
NOT NULL |
AUTO |
Primary key (auto-increment) |
timestamp |
TIMESTAMP |
NOT NULL |
NOW() |
Log entry timestamp |
level |
VARCHAR(20) |
NOT NULL |
- |
Log level: DEBUG, INFO, WARNING, ERROR, CRITICAL |
service |
VARCHAR(100) |
NULL |
- |
Service name that generated the log |
message |
TEXT |
NOT NULL |
- |
Log message |
stack_trace |
TEXT |
NULL |
- |
Stack trace for errors |
request_id |
VARCHAR(100) |
NULL |
- |
Request ID for tracing |
user_id |
VARCHAR(100) |
NULL |
- |
User ID associated with the log |
metadata |
JSONB |
NULL |
- |
Additional structured metadata |
created_at |
TIMESTAMP |
NULL |
NOW() |
Record creation timestamp |
Indexes
- Primary Key:
log_id
- Indexes:
timestamp DESC, level, service, request_id
Log Levels
- DEBUG: Detailed diagnostic information
- INFO: General informational messages
- WARNING: Warning messages for potential issues
- ERROR: Error events that might still allow the application to continue
- CRITICAL: Critical events that may cause application failure
{
"method": "POST",
"endpoint": "/api/test",
"status_code": 500,
"duration_ms": 1250,
"ip_address": "192.168.1.100",
"user_agent": "Mozilla/5.0..."
}
4. Traces Table
Purpose: Distributed tracing data for request flow analysis across services.
Schema
| Column |
Type |
Nullable |
Default |
Description |
trace_id |
VARCHAR(100) |
NOT NULL |
- |
Unique trace identifier |
span_id |
VARCHAR(100) |
NOT NULL |
- |
Unique span identifier |
parent_span_id |
VARCHAR(100) |
NULL |
- |
Parent span ID (for nested spans) |
operation_name |
VARCHAR(200) |
NOT NULL |
- |
Operation/function name |
service_name |
VARCHAR(100) |
NOT NULL |
- |
Service that performed the operation |
start_time |
TIMESTAMP |
NOT NULL |
- |
Span start time |
end_time |
TIMESTAMP |
NULL |
- |
Span end time |
duration_ms |
INTEGER |
NULL |
- |
Duration in milliseconds |
status |
VARCHAR(20) |
NULL |
- |
Status: success, error, timeout |
tags |
JSONB |
NULL |
- |
Span tags (key-value pairs) |
logs |
JSONB |
NULL |
- |
Span logs/events |
created_at |
TIMESTAMP |
NULL |
NOW() |
Record creation timestamp |
Indexes
- Primary Key:
trace_id
- Indexes:
start_time DESC, service_name, operation_name, status
{
"http.method": "GET",
"http.url": "/api/users/123",
"http.status_code": 200,
"db.statement": "SELECT * FROM users WHERE id = ?",
"db.rows": 1,
"error": false
}
Logs Example
[
{
"timestamp": "2025-12-07T10:30:45.123Z",
"event": "cache_miss",
"message": "User not found in cache, querying database"
},
{
"timestamp": "2025-12-07T10:30:45.156Z",
"event": "db_query_complete",
"message": "Database query completed successfully"
}
]
5. Coverage Table
Purpose: Test coverage metrics by component and test type.
Schema
| Column |
Type |
Nullable |
Default |
Description |
coverage_id |
INTEGER |
NOT NULL |
AUTO |
Primary key (auto-increment) |
timestamp |
TIMESTAMP |
NOT NULL |
NOW() |
Coverage measurement timestamp |
component |
VARCHAR(100) |
NOT NULL |
- |
Component/module name |
coverage_percent |
DOUBLE PRECISION |
NOT NULL |
- |
Overall coverage percentage (0-100) |
lines_covered |
INTEGER |
NULL |
- |
Number of lines covered |
lines_total |
INTEGER |
NULL |
- |
Total number of lines |
branches_covered |
INTEGER |
NULL |
- |
Number of branches covered |
branches_total |
INTEGER |
NULL |
- |
Total number of branches |
test_type |
VARCHAR(50) |
NULL |
- |
Test type that generated coverage |
metadata |
JSONB |
NULL |
- |
Additional coverage metadata |
created_at |
TIMESTAMP |
NULL |
NOW() |
Record creation timestamp |
Indexes
- Primary Key:
coverage_id
- Indexes:
timestamp DESC, component, test_type
{
"functions_covered": 45,
"functions_total": 50,
"statements_covered": 234,
"statements_total": 267,
"complexity_avg": 3.2,
"file_path": "src/services/auth.py"
}
6. Alerts Table
Purpose: System alerts and notifications for monitoring and incident management.
Schema
| Column |
Type |
Nullable |
Default |
Description |
alert_id |
INTEGER |
NOT NULL |
AUTO |
Primary key (auto-increment) |
timestamp |
TIMESTAMP |
NOT NULL |
NOW() |
Alert creation timestamp |
severity |
VARCHAR(20) |
NOT NULL |
- |
Severity: info, warning, error, critical |
title |
VARCHAR(200) |
NOT NULL |
- |
Alert title/summary |
message |
TEXT |
NOT NULL |
- |
Detailed alert message |
source |
VARCHAR(100) |
NULL |
- |
Alert source (service, monitor, etc.) |
status |
VARCHAR(20) |
NULL |
‘active’ |
Status: active, acknowledged, resolved |
resolved_at |
TIMESTAMP |
NULL |
- |
Resolution timestamp |
metadata |
JSONB |
NULL |
- |
Additional alert metadata |
created_at |
TIMESTAMP |
NULL |
NOW() |
Record creation timestamp |
Indexes
- Primary Key:
alert_id
- Indexes:
timestamp DESC, severity, status, source
Severity Levels
- info: Informational alerts (FYI)
- warning: Potential issues requiring attention
- error: Errors affecting functionality
- critical: Critical issues requiring immediate action
{
"affected_services": ["api-gateway", "database"],
"metric": "cpu_usage",
"threshold": 90,
"current_value": 95.3,
"duration_seconds": 180,
"runbook_url": "https://wiki/runbooks/high-cpu",
"assigned_to": "oncall-team"
}
7. Relationships
Entity Relationship Diagram
┌─────────────┐
│ Agents │
│ (43 total) │
└─────────────┘
↓ (updates status)
┌──────────────────┐ ┌─────────────┐
│ Test Results │─────→│ Coverage │
│ (execution data) │ │ (metrics) │
└──────────────────┘ └─────────────┘
↓ (generates)
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Logs │←────→│ Traces │─────→│ Alerts │
│ (events) │ │ (spans) │ │ (incidents) │
└─────────────┘ └─────────────┘ └─────────────┘
Logical Relationships
- Test Results → Coverage: Test execution generates coverage data
- Test Results → Logs: Test failures create log entries
- Traces → Logs: Distributed traces reference log entries via
request_id
- Any Table → Alerts: Anomalies in any table can trigger alerts
- Agents → Test Results: QA agents track test execution metrics
8. Sample Queries
Get Agent Status Summary
SELECT
category,
COUNT(*) as total_agents,
SUM(CASE WHEN status = 'healthy' THEN 1 ELSE 0 END) as healthy,
SUM(CASE WHEN status = 'warning' THEN 1 ELSE 0 END) as warning,
SUM(CASE WHEN status = 'error' THEN 1 ELSE 0 END) as error,
ROUND(AVG(success_rate), 2) as avg_success_rate
FROM agents
GROUP BY category
ORDER BY category;
Get Test Pass Rate Trend (Last 30 Days)
SELECT
DATE(timestamp) as test_date,
test_type,
COUNT(*) as runs,
ROUND(100.0 * SUM(passed) / NULLIF(SUM(total_tests), 0), 2) as pass_rate
FROM test_results
WHERE timestamp > NOW() - INTERVAL '30 days'
AND status = 'completed'
GROUP BY DATE(timestamp), test_type
ORDER BY test_date DESC, test_type;
Get Most Common Test Failures
SELECT
jsonb_array_elements(failed_tests)->>'name' as test_name,
COUNT(*) as failure_count,
jsonb_array_elements(failed_tests)->>'file' as test_file
FROM test_results
WHERE status = 'completed'
AND failed > 0
AND timestamp > NOW() - INTERVAL '7 days'
GROUP BY test_name, test_file
ORDER BY failure_count DESC
LIMIT 10;
Get Recent Error Logs with Context
SELECT
timestamp,
service,
message,
request_id,
metadata->>'endpoint' as endpoint,
metadata->>'status_code' as status_code
FROM logs
WHERE level IN ('ERROR', 'CRITICAL')
AND timestamp > NOW() - INTERVAL '1 hour'
ORDER BY timestamp DESC
LIMIT 50;
Get Slow Traces (> 1 second)
SELECT
trace_id,
service_name,
operation_name,
duration_ms,
tags->>'http.url' as url,
tags->>'http.method' as method
FROM traces
WHERE duration_ms > 1000
AND start_time > NOW() - INTERVAL '24 hours'
ORDER BY duration_ms DESC
LIMIT 20;
Get Coverage by Component
SELECT
component,
ROUND(AVG(coverage_percent), 2) as avg_coverage,
MAX(timestamp) as last_measured
FROM coverage
WHERE timestamp > NOW() - INTERVAL '7 days'
GROUP BY component
HAVING AVG(coverage_percent) < 80 -- Components below 80% coverage
ORDER BY avg_coverage ASC;
Get Active Critical Alerts
SELECT
alert_id,
title,
message,
source,
timestamp,
EXTRACT(EPOCH FROM (NOW() - timestamp))/60 as age_minutes,
metadata->>'affected_services' as affected_services
FROM alerts
WHERE status = 'active'
AND severity = 'critical'
ORDER BY timestamp DESC;
Get Test Execution Analytics
WITH test_stats AS (
SELECT
test_type,
COUNT(*) as total_runs,
ROUND(AVG(execution_time), 2) as avg_duration,
ROUND(100.0 * SUM(passed) / NULLIF(SUM(total_tests), 0), 2) as pass_rate,
SUM(failed) as total_failures
FROM test_results
WHERE timestamp > NOW() - INTERVAL '30 days'
AND status = 'completed'
GROUP BY test_type
)
SELECT * FROM test_stats
ORDER BY pass_rate ASC;
Data Retention Policy
Recommended Retention Periods
- Agents: Indefinite (configuration data)
- Test Results: 90 days (archive older data)
- Logs: 30 days (7 days for DEBUG level)
- Traces: 7 days (high volume data)
- Coverage: 90 days
- Alerts: 180 days (resolved alerts)
Archival Query Example
-- Archive old test results to separate table
INSERT INTO test_results_archive
SELECT * FROM test_results
WHERE timestamp < NOW() - INTERVAL '90 days';
DELETE FROM test_results
WHERE timestamp < NOW() - INTERVAL '90 days';
Current Index Coverage
- All tables have primary keys
- All timestamp columns indexed for time-series queries
- Foreign key relationships optimized with indexes
- Composite indexes for common query patterns
Query Optimization Tips
- Always filter by timestamp ranges
- Use indexes for WHERE, JOIN, and ORDER BY
- Use JSONB operators efficiently:
->, ->>, @>, ?
- Consider partitioning for high-volume tables (logs, traces)
- Use EXPLAIN ANALYZE to validate query plans
Migration History
| Migration |
File |
Contents |
| 001 |
001_create_observability_tables.sql |
Core observability schema (agents, logs, traces, alerts, coverage, test_results) |
| 002 |
002_populate_all_agents.sql |
Seed data — 58+ agents |
| 003 |
003_create_document_tables.sql |
Document storage for RAG and knowledge base |
| 004 |
004_create_memory_tables.sql |
Agent shared memory store |
| 005 |
005_create_agent_learning_tables.sql |
Self-learning: lessons, applications, stats, refinement runs |
Sandbox Session Schema
Sandbox session metadata is persisted to PostgreSQL for audit and recovery.
sandbox_sessions
| Column |
Type |
Description |
session_id |
UUID PK |
Unique session identifier |
user_id |
VARCHAR(100) |
Authenticated user who created the session |
project_id |
VARCHAR(255) |
Source generated project (nullable for blank sessions) |
status |
VARCHAR(20) |
creating, ready, running, terminated, error |
runtime_type |
VARCHAR(50) |
Primary runtime: python312, node20 |
containers |
JSONB |
Map of ContainerRole to ContainerInfo (id, ip, runtime, status) |
preview_port |
INTEGER |
Port exposed for live preview |
backend_port |
INTEGER |
Port the backend API listens on (nullable) |
pipeline_stage |
VARCHAR(100) |
Last completed pipeline stage |
pipeline_log |
TEXT |
Accumulated SSE log from launch pipeline |
created_at |
TIMESTAMP |
Session creation time |
last_active |
TIMESTAMP |
Last container activity |
terminated_at |
TIMESTAMP |
Session termination time (nullable) |
Agent Self-Learning Schema (Migration 005)
agent_lessons
| Column |
Type |
Description |
id |
SERIAL PK |
Auto-increment ID |
agent_name |
VARCHAR(100) |
Agent that owns this lesson |
lesson_text |
TEXT |
The lesson (injected into agent prompt) |
severity |
VARCHAR(20) |
low, medium, high, critical |
category |
VARCHAR(50) |
security, performance, code_quality, correctness |
source_issue_type |
VARCHAR(100) |
Type of QA issue that generated this lesson |
times_applied |
INTEGER |
How many times this lesson has been used |
effectiveness_score |
FLOAT |
Auto-calculated (0.0-1.0) via trigger |
is_active |
BOOLEAN |
Whether this lesson is currently injected |
expires_at |
TIMESTAMP |
Optional expiry (nullable) |
created_at |
TIMESTAMP |
When the lesson was learned |
updated_at |
TIMESTAMP |
Last update |
agent_lesson_applications
| Column |
Type |
Description |
id |
SERIAL PK |
|
lesson_id |
INTEGER FK |
References agent_lessons.id |
agent_name |
VARCHAR(100) |
Agent that applied the lesson |
pipeline_run_id |
VARCHAR(255) |
Pipeline run where lesson was applied |
project_name |
VARCHAR(255) |
Project name |
prevented_issue |
BOOLEAN |
Whether the lesson actually prevented the issue |
outcome_notes |
TEXT |
Optional notes on outcome |
outcome_verified_at |
TIMESTAMP |
When outcome was verified (nullable) |
agent_learning_stats
Auto-updated via database triggers.
| Column |
Type |
Description |
agent_name |
VARCHAR(100) PK |
|
total_lessons_learned |
INTEGER |
|
active_lessons |
INTEGER |
|
total_applications |
INTEGER |
|
successful_preventions |
INTEGER |
|
overall_effectiveness |
FLOAT |
successful_preventions / total_applications |
last_lesson_learned_at |
TIMESTAMP |
|
last_lesson_applied_at |
TIMESTAMP |
|
agent_refinement_runs
| Column |
Type |
Description |
id |
SERIAL PK |
|
triggered_by |
VARCHAR(100) |
rework, hitl, scheduled |
pipeline_run_id |
VARCHAR(255) |
|
issues_analyzed |
INTEGER |
|
lessons_generated |
INTEGER |
|
status |
VARCHAR(20) |
running, completed, failed |
started_at |
TIMESTAMP |
|
completed_at |
TIMESTAMP |
|
error_message |
TEXT |
Nullable |
For schema changes, run the appropriate migration file against the purple8 database. See the migrations/ directory for all SQL files.
Last Updated: February 2026
Schema Version: 2.0.0