Purple8-platform

Purple8 Database Schema Documentation

Overview

This document describes the PostgreSQL database schema for the Purple8 platform. It covers the observability tables (migration 001–002), document and memory tables (003–004), agent self-learning tables (005), and the sandbox session tables.

Agents Table
Test Results Table
Logs Table
Traces Table
Coverage Table
Alerts Table
Relationships
Sample Queries

1. Agents Table

Purpose: Stores status and metrics for all agents (Core, QA, AI/ML, Enterprise) and infrastructure components.

Schema

Column	Type	Nullable	Default	Description
`agent_id`	INTEGER	NOT NULL	AUTO	Primary key (auto-increment)
`name`	VARCHAR(100)	NOT NULL	-	Unique agent name (e.g., “API Gateway”, “Unit Testing”)
`category`	VARCHAR(50)	NOT NULL	-	Agent category: core, qa, aiml, enterprise, infrastructure
`status`	VARCHAR(20)	NOT NULL	‘unknown’	Current status: healthy, warning, error, unknown
`uptime`	DOUBLE PRECISION	NULL	0	Uptime percentage (0-100)
`avg_response_time`	INTEGER	NULL	0	Average response time in milliseconds
`success_rate`	DOUBLE PRECISION	NULL	0	Success rate percentage (0-100)
`last_active`	TIMESTAMP	NULL	-	Last activity timestamp
`metadata`	JSONB	NULL	-	Additional agent-specific metadata
`created_at`	TIMESTAMP	NULL	NOW()	Record creation timestamp
`updated_at`	TIMESTAMP	NULL	NOW()	Last update timestamp

Indexes

Primary Key: agent_id
Unique: name
Indexes: category, status, last_active DESC, (category, status)

Metadata Examples

// Core agents
{"type": "service", "port": "8000"}

// QA agents
{"tests_run": 172, "pass_rate": 95.3, "coverage": 87.5}

// AI/ML agents
{"accuracy": 94.5, "inference_time": 250, "memory_mb": 512}

// Infrastructure
{"type": "database", "version": "15", "port": "5432"}

2. Test Results Table

Purpose: Stores detailed test execution results for all test types with comprehensive failure tracking.

Schema

Column	Type	Nullable	Default	Description
`test_id`	INTEGER	NOT NULL	AUTO	Primary key (auto-increment)
`test_type`	VARCHAR(50)	NOT NULL	-	Type: unit, integration, security, performance, regression, uat, load, smoke, e2e
`status`	VARCHAR(20)	NOT NULL	-	Status: completed, failed, running, cancelled
`timestamp`	TIMESTAMPTZ	NULL	CURRENT_TIMESTAMP	Test execution timestamp
`total_tests`	INTEGER	NOT NULL	0	Total number of tests executed
`passed`	INTEGER	NOT NULL	0	Number of tests passed
`failed`	INTEGER	NOT NULL	0	Number of tests failed
`skipped`	INTEGER	NOT NULL	0	Number of tests skipped
`warnings`	INTEGER	NOT NULL	0	Number of warnings generated
`execution_time`	DOUBLE PRECISION	NOT NULL	0.0	Total execution time in seconds
`failed_tests`	JSONB	NULL	’[]’	Array of failed test details
`warning_details`	JSONB	NULL	’[]’	Array of warning details
`test_output`	TEXT	NULL	-	Raw test output/logs
`triggered_by`	VARCHAR(100)	NULL	‘manual’	Who/what triggered the test
`environment`	VARCHAR(50)	NULL	‘local’	Environment: local, docker, ci, staging, prod
`branch_name`	VARCHAR(100)	NULL	-	Git branch name
`commit_hash`	VARCHAR(40)	NULL	-	Git commit hash

Indexes

Primary Key: test_id
Indexes: timestamp DESC, test_type, status, (test_type, timestamp DESC)

Constraints

status must be: completed, failed, running, cancelled
test_type must be: unit, integration, security, performance, regression, uat, load, smoke, e2e
total_tests >= (passed + failed + skipped)

Failed Tests JSON Structure

[
  {
    "name": "test_api_authentication",
    "error": "AssertionError: Expected 200, got 401",
    "file": "tests/test_api.py",
    "line": 45,
    "stack_trace": "Traceback (most recent call last):\n  File..."
  }
]

Warning Details JSON Structure

[
  {
    "message": "DeprecationWarning: Function X is deprecated",
    "file": "tests/test_utils.py",
    "line": 23
  }
]

3. Logs Table

Purpose: Centralized application logs from all services with structured metadata.

Schema

Column	Type	Nullable	Default	Description
`log_id`	INTEGER	NOT NULL	AUTO	Primary key (auto-increment)
`timestamp`	TIMESTAMP	NOT NULL	NOW()	Log entry timestamp
`level`	VARCHAR(20)	NOT NULL	-	Log level: DEBUG, INFO, WARNING, ERROR, CRITICAL
`service`	VARCHAR(100)	NULL	-	Service name that generated the log
`message`	TEXT	NOT NULL	-	Log message
`stack_trace`	TEXT	NULL	-	Stack trace for errors
`request_id`	VARCHAR(100)	NULL	-	Request ID for tracing
`user_id`	VARCHAR(100)	NULL	-	User ID associated with the log
`metadata`	JSONB	NULL	-	Additional structured metadata
`created_at`	TIMESTAMP	NULL	NOW()	Record creation timestamp

Indexes

Primary Key: log_id
Indexes: timestamp DESC, level, service, request_id

Log Levels

DEBUG: Detailed diagnostic information
INFO: General informational messages
WARNING: Warning messages for potential issues
ERROR: Error events that might still allow the application to continue
CRITICAL: Critical events that may cause application failure

Metadata Example

{
  "method": "POST",
  "endpoint": "/api/test",
  "status_code": 500,
  "duration_ms": 1250,
  "ip_address": "192.168.1.100",
  "user_agent": "Mozilla/5.0..."
}

4. Traces Table

Purpose: Distributed tracing data for request flow analysis across services.

Schema

Column	Type	Nullable	Default	Description
`trace_id`	VARCHAR(100)	NOT NULL	-	Unique trace identifier
`span_id`	VARCHAR(100)	NOT NULL	-	Unique span identifier
`parent_span_id`	VARCHAR(100)	NULL	-	Parent span ID (for nested spans)
`operation_name`	VARCHAR(200)	NOT NULL	-	Operation/function name
`service_name`	VARCHAR(100)	NOT NULL	-	Service that performed the operation
`start_time`	TIMESTAMP	NOT NULL	-	Span start time
`end_time`	TIMESTAMP	NULL	-	Span end time
`duration_ms`	INTEGER	NULL	-	Duration in milliseconds
`status`	VARCHAR(20)	NULL	-	Status: success, error, timeout
`tags`	JSONB	NULL	-	Span tags (key-value pairs)
`logs`	JSONB	NULL	-	Span logs/events
`created_at`	TIMESTAMP	NULL	NOW()	Record creation timestamp

Indexes

Primary Key: trace_id
Indexes: start_time DESC, service_name, operation_name, status

Tags Example

{
  "http.method": "GET",
  "http.url": "/api/users/123",
  "http.status_code": 200,
  "db.statement": "SELECT * FROM users WHERE id = ?",
  "db.rows": 1,
  "error": false
}

Logs Example

[
  {
    "timestamp": "2025-12-07T10:30:45.123Z",
    "event": "cache_miss",
    "message": "User not found in cache, querying database"
  },
  {
    "timestamp": "2025-12-07T10:30:45.156Z",
    "event": "db_query_complete",
    "message": "Database query completed successfully"
  }
]

5. Coverage Table

Purpose: Test coverage metrics by component and test type.

Schema

Column	Type	Nullable	Default	Description
`coverage_id`	INTEGER	NOT NULL	AUTO	Primary key (auto-increment)
`timestamp`	TIMESTAMP	NOT NULL	NOW()	Coverage measurement timestamp
`component`	VARCHAR(100)	NOT NULL	-	Component/module name
`coverage_percent`	DOUBLE PRECISION	NOT NULL	-	Overall coverage percentage (0-100)
`lines_covered`	INTEGER	NULL	-	Number of lines covered
`lines_total`	INTEGER	NULL	-	Total number of lines
`branches_covered`	INTEGER	NULL	-	Number of branches covered
`branches_total`	INTEGER	NULL	-	Total number of branches
`test_type`	VARCHAR(50)	NULL	-	Test type that generated coverage
`metadata`	JSONB	NULL	-	Additional coverage metadata
`created_at`	TIMESTAMP	NULL	NOW()	Record creation timestamp

Indexes

Primary Key: coverage_id
Indexes: timestamp DESC, component, test_type

Metadata Example

{
  "functions_covered": 45,
  "functions_total": 50,
  "statements_covered": 234,
  "statements_total": 267,
  "complexity_avg": 3.2,
  "file_path": "src/services/auth.py"
}

6. Alerts Table

Purpose: System alerts and notifications for monitoring and incident management.

Schema

Column	Type	Nullable	Default	Description
`alert_id`	INTEGER	NOT NULL	AUTO	Primary key (auto-increment)
`timestamp`	TIMESTAMP	NOT NULL	NOW()	Alert creation timestamp
`severity`	VARCHAR(20)	NOT NULL	-	Severity: info, warning, error, critical
`title`	VARCHAR(200)	NOT NULL	-	Alert title/summary
`message`	TEXT	NOT NULL	-	Detailed alert message
`source`	VARCHAR(100)	NULL	-	Alert source (service, monitor, etc.)
`status`	VARCHAR(20)	NULL	‘active’	Status: active, acknowledged, resolved
`resolved_at`	TIMESTAMP	NULL	-	Resolution timestamp
`metadata`	JSONB	NULL	-	Additional alert metadata
`created_at`	TIMESTAMP	NULL	NOW()	Record creation timestamp

Indexes

Primary Key: alert_id
Indexes: timestamp DESC, severity, status, source

Severity Levels

info: Informational alerts (FYI)
warning: Potential issues requiring attention
error: Errors affecting functionality
critical: Critical issues requiring immediate action

Metadata Example

{
  "affected_services": ["api-gateway", "database"],
  "metric": "cpu_usage",
  "threshold": 90,
  "current_value": 95.3,
  "duration_seconds": 180,
  "runbook_url": "https://wiki/runbooks/high-cpu",
  "assigned_to": "oncall-team"
}

7. Relationships

Entity Relationship Diagram

┌─────────────┐
│   Agents    │
│ (43 total)  │
└─────────────┘
      ↓ (updates status)
      
┌──────────────────┐      ┌─────────────┐
│  Test Results    │─────→│  Coverage   │
│ (execution data) │      │ (metrics)   │
└──────────────────┘      └─────────────┘
      ↓ (generates)
      
┌─────────────┐      ┌─────────────┐      ┌─────────────┐
│    Logs     │←────→│   Traces    │─────→│   Alerts    │
│ (events)    │      │ (spans)     │      │ (incidents) │
└─────────────┘      └─────────────┘      └─────────────┘

Logical Relationships

Test Results → Coverage: Test execution generates coverage data
Test Results → Logs: Test failures create log entries
Traces → Logs: Distributed traces reference log entries via request_id
Any Table → Alerts: Anomalies in any table can trigger alerts
Agents → Test Results: QA agents track test execution metrics

8. Sample Queries

Get Agent Status Summary

SELECT 
    category,
    COUNT(*) as total_agents,
    SUM(CASE WHEN status = 'healthy' THEN 1 ELSE 0 END) as healthy,
    SUM(CASE WHEN status = 'warning' THEN 1 ELSE 0 END) as warning,
    SUM(CASE WHEN status = 'error' THEN 1 ELSE 0 END) as error,
    ROUND(AVG(success_rate), 2) as avg_success_rate
FROM agents
GROUP BY category
ORDER BY category;

Get Test Pass Rate Trend (Last 30 Days)

SELECT 
    DATE(timestamp) as test_date,
    test_type,
    COUNT(*) as runs,
    ROUND(100.0 * SUM(passed) / NULLIF(SUM(total_tests), 0), 2) as pass_rate
FROM test_results
WHERE timestamp > NOW() - INTERVAL '30 days'
    AND status = 'completed'
GROUP BY DATE(timestamp), test_type
ORDER BY test_date DESC, test_type;

Get Most Common Test Failures

SELECT 
    jsonb_array_elements(failed_tests)->>'name' as test_name,
    COUNT(*) as failure_count,
    jsonb_array_elements(failed_tests)->>'file' as test_file
FROM test_results
WHERE status = 'completed' 
    AND failed > 0
    AND timestamp > NOW() - INTERVAL '7 days'
GROUP BY test_name, test_file
ORDER BY failure_count DESC
LIMIT 10;

Get Recent Error Logs with Context

SELECT 
    timestamp,
    service,
    message,
    request_id,
    metadata->>'endpoint' as endpoint,
    metadata->>'status_code' as status_code
FROM logs
WHERE level IN ('ERROR', 'CRITICAL')
    AND timestamp > NOW() - INTERVAL '1 hour'
ORDER BY timestamp DESC
LIMIT 50;

Get Slow Traces (> 1 second)

SELECT 
    trace_id,
    service_name,
    operation_name,
    duration_ms,
    tags->>'http.url' as url,
    tags->>'http.method' as method
FROM traces
WHERE duration_ms > 1000
    AND start_time > NOW() - INTERVAL '24 hours'
ORDER BY duration_ms DESC
LIMIT 20;

Get Coverage by Component

SELECT 
    component,
    ROUND(AVG(coverage_percent), 2) as avg_coverage,
    MAX(timestamp) as last_measured
FROM coverage
WHERE timestamp > NOW() - INTERVAL '7 days'
GROUP BY component
HAVING AVG(coverage_percent) < 80  -- Components below 80% coverage
ORDER BY avg_coverage ASC;

Get Active Critical Alerts

SELECT 
    alert_id,
    title,
    message,
    source,
    timestamp,
    EXTRACT(EPOCH FROM (NOW() - timestamp))/60 as age_minutes,
    metadata->>'affected_services' as affected_services
FROM alerts
WHERE status = 'active'
    AND severity = 'critical'
ORDER BY timestamp DESC;

Get Test Execution Analytics

WITH test_stats AS (
    SELECT 
        test_type,
        COUNT(*) as total_runs,
        ROUND(AVG(execution_time), 2) as avg_duration,
        ROUND(100.0 * SUM(passed) / NULLIF(SUM(total_tests), 0), 2) as pass_rate,
        SUM(failed) as total_failures
    FROM test_results
    WHERE timestamp > NOW() - INTERVAL '30 days'
        AND status = 'completed'
    GROUP BY test_type
)
SELECT * FROM test_stats
ORDER BY pass_rate ASC;

Data Retention Policy

Recommended Retention Periods

Agents: Indefinite (configuration data)
Test Results: 90 days (archive older data)
Logs: 30 days (7 days for DEBUG level)
Traces: 7 days (high volume data)
Coverage: 90 days
Alerts: 180 days (resolved alerts)

Archival Query Example

-- Archive old test results to separate table
INSERT INTO test_results_archive
SELECT * FROM test_results
WHERE timestamp < NOW() - INTERVAL '90 days';

DELETE FROM test_results
WHERE timestamp < NOW() - INTERVAL '90 days';

Indexes and Performance

Current Index Coverage

All tables have primary keys
All timestamp columns indexed for time-series queries
Foreign key relationships optimized with indexes
Composite indexes for common query patterns

Query Optimization Tips

Always filter by timestamp ranges
Use indexes for WHERE, JOIN, and ORDER BY
Use JSONB operators efficiently: ->, ->>, @>, ?
Consider partitioning for high-volume tables (logs, traces)
Use EXPLAIN ANALYZE to validate query plans

Migration History

Migration	File	Contents
001	`001_create_observability_tables.sql`	Core observability schema (agents, logs, traces, alerts, coverage, test_results)
002	`002_populate_all_agents.sql`	Seed data — 58+ agents
003	`003_create_document_tables.sql`	Document storage for RAG and knowledge base
004	`004_create_memory_tables.sql`	Agent shared memory store
005	`005_create_agent_learning_tables.sql`	Self-learning: lessons, applications, stats, refinement runs

Sandbox Session Schema

Sandbox session metadata is persisted to PostgreSQL for audit and recovery.

`sandbox_sessions`

Column	Type	Description
`session_id`	UUID PK	Unique session identifier
`user_id`	VARCHAR(100)	Authenticated user who created the session
`project_id`	VARCHAR(255)	Source generated project (nullable for blank sessions)
`status`	VARCHAR(20)	`creating`, `ready`, `running`, `terminated`, `error`
`runtime_type`	VARCHAR(50)	Primary runtime: `python312`, `node20`
`containers`	JSONB	Map of ContainerRole to ContainerInfo (id, ip, runtime, status)
`preview_port`	INTEGER	Port exposed for live preview
`backend_port`	INTEGER	Port the backend API listens on (nullable)
`pipeline_stage`	VARCHAR(100)	Last completed pipeline stage
`pipeline_log`	TEXT	Accumulated SSE log from launch pipeline
`created_at`	TIMESTAMP	Session creation time
`last_active`	TIMESTAMP	Last container activity
`terminated_at`	TIMESTAMP	Session termination time (nullable)

Agent Self-Learning Schema (Migration 005)

`agent_lessons`

Column	Type	Description
`id`	SERIAL PK	Auto-increment ID
`agent_name`	VARCHAR(100)	Agent that owns this lesson
`lesson_text`	TEXT	The lesson (injected into agent prompt)
`severity`	VARCHAR(20)	`low`, `medium`, `high`, `critical`
`category`	VARCHAR(50)	`security`, `performance`, `code_quality`, `correctness`
`source_issue_type`	VARCHAR(100)	Type of QA issue that generated this lesson
`times_applied`	INTEGER	How many times this lesson has been used
`effectiveness_score`	FLOAT	Auto-calculated (0.0-1.0) via trigger
`is_active`	BOOLEAN	Whether this lesson is currently injected
`expires_at`	TIMESTAMP	Optional expiry (nullable)
`created_at`	TIMESTAMP	When the lesson was learned
`updated_at`	TIMESTAMP	Last update

`agent_lesson_applications`

Column	Type	Description
`id`	SERIAL PK
`lesson_id`	INTEGER FK	References agent_lessons.id
`agent_name`	VARCHAR(100)	Agent that applied the lesson
`pipeline_run_id`	VARCHAR(255)	Pipeline run where lesson was applied
`project_name`	VARCHAR(255)	Project name
`prevented_issue`	BOOLEAN	Whether the lesson actually prevented the issue
`outcome_notes`	TEXT	Optional notes on outcome
`outcome_verified_at`	TIMESTAMP	When outcome was verified (nullable)

`agent_learning_stats`

Auto-updated via database triggers.

Column	Type	Description
`agent_name`	VARCHAR(100) PK
`total_lessons_learned`	INTEGER
`active_lessons`	INTEGER
`total_applications`	INTEGER
`successful_preventions`	INTEGER
`overall_effectiveness`	FLOAT	successful_preventions / total_applications
`last_lesson_learned_at`	TIMESTAMP
`last_lesson_applied_at`	TIMESTAMP

`agent_refinement_runs`

Column	Type	Description
`id`	SERIAL PK
`triggered_by`	VARCHAR(100)	`rework`, `hitl`, `scheduled`
`pipeline_run_id`	VARCHAR(255)
`issues_analyzed`	INTEGER
`lessons_generated`	INTEGER
`status`	VARCHAR(20)	`running`, `completed`, `failed`
`started_at`	TIMESTAMP
`completed_at`	TIMESTAMP
`error_message`	TEXT	Nullable

Contact & Support

For schema changes, run the appropriate migration file against the purple8 database. See the migrations/ directory for all SQL files.

Last Updated: February 2026 Schema Version: 2.0.0

Purple8-platform

Purple8 Database Schema Documentation

Overview

Table of Contents

1. Agents Table

Schema

Indexes

Categories

Metadata Examples

2. Test Results Table

Schema

Indexes

Constraints

Failed Tests JSON Structure

Warning Details JSON Structure

3. Logs Table

Schema

Indexes

Log Levels

Metadata Example

4. Traces Table

Schema

Indexes

Tags Example

Logs Example

5. Coverage Table

Schema

Indexes

Metadata Example

6. Alerts Table

Schema

Indexes

Severity Levels

Metadata Example

7. Relationships

Entity Relationship Diagram

Logical Relationships

8. Sample Queries

Get Agent Status Summary

Get Test Pass Rate Trend (Last 30 Days)

Get Most Common Test Failures

Get Recent Error Logs with Context

Get Slow Traces (> 1 second)

Get Coverage by Component

Get Active Critical Alerts

Get Test Execution Analytics

Data Retention Policy

Recommended Retention Periods

Archival Query Example

Indexes and Performance

Current Index Coverage

Query Optimization Tips

Migration History

Sandbox Session Schema

sandbox_sessions

Agent Self-Learning Schema (Migration 005)

agent_lessons

agent_lesson_applications

agent_learning_stats

agent_refinement_runs

Contact & Support

`sandbox_sessions`

`agent_lessons`

`agent_lesson_applications`

`agent_learning_stats`

`agent_refinement_runs`