Purple8-platform

Purple8 Database Schema Documentation

Overview

This document describes the PostgreSQL database schema for the Purple8 platform. It covers the observability tables (migration 001–002), document and memory tables (003–004), agent self-learning tables (005), and the sandbox session tables.


Table of Contents

  1. Agents Table
  2. Test Results Table
  3. Logs Table
  4. Traces Table
  5. Coverage Table
  6. Alerts Table
  7. Relationships
  8. Sample Queries

1. Agents Table

Purpose: Stores status and metrics for all agents (Core, QA, AI/ML, Enterprise) and infrastructure components.

Schema

Column Type Nullable Default Description
agent_id INTEGER NOT NULL AUTO Primary key (auto-increment)
name VARCHAR(100) NOT NULL - Unique agent name (e.g., “API Gateway”, “Unit Testing”)
category VARCHAR(50) NOT NULL - Agent category: core, qa, aiml, enterprise, infrastructure
status VARCHAR(20) NOT NULL ‘unknown’ Current status: healthy, warning, error, unknown
uptime DOUBLE PRECISION NULL 0 Uptime percentage (0-100)
avg_response_time INTEGER NULL 0 Average response time in milliseconds
success_rate DOUBLE PRECISION NULL 0 Success rate percentage (0-100)
last_active TIMESTAMP NULL - Last activity timestamp
metadata JSONB NULL - Additional agent-specific metadata
created_at TIMESTAMP NULL NOW() Record creation timestamp
updated_at TIMESTAMP NULL NOW() Last update timestamp

Indexes

Categories

Metadata Examples

// Core agents
{"type": "service", "port": "8000"}

// QA agents
{"tests_run": 172, "pass_rate": 95.3, "coverage": 87.5}

// AI/ML agents
{"accuracy": 94.5, "inference_time": 250, "memory_mb": 512}

// Infrastructure
{"type": "database", "version": "15", "port": "5432"}

2. Test Results Table

Purpose: Stores detailed test execution results for all test types with comprehensive failure tracking.

Schema

Column Type Nullable Default Description
test_id INTEGER NOT NULL AUTO Primary key (auto-increment)
test_type VARCHAR(50) NOT NULL - Type: unit, integration, security, performance, regression, uat, load, smoke, e2e
status VARCHAR(20) NOT NULL - Status: completed, failed, running, cancelled
timestamp TIMESTAMPTZ NULL CURRENT_TIMESTAMP Test execution timestamp
total_tests INTEGER NOT NULL 0 Total number of tests executed
passed INTEGER NOT NULL 0 Number of tests passed
failed INTEGER NOT NULL 0 Number of tests failed
skipped INTEGER NOT NULL 0 Number of tests skipped
warnings INTEGER NOT NULL 0 Number of warnings generated
execution_time DOUBLE PRECISION NOT NULL 0.0 Total execution time in seconds
failed_tests JSONB NULL ’[]’ Array of failed test details
warning_details JSONB NULL ’[]’ Array of warning details
test_output TEXT NULL - Raw test output/logs
triggered_by VARCHAR(100) NULL ‘manual’ Who/what triggered the test
environment VARCHAR(50) NULL ‘local’ Environment: local, docker, ci, staging, prod
branch_name VARCHAR(100) NULL - Git branch name
commit_hash VARCHAR(40) NULL - Git commit hash

Indexes

Constraints

Failed Tests JSON Structure

[
  {
    "name": "test_api_authentication",
    "error": "AssertionError: Expected 200, got 401",
    "file": "tests/test_api.py",
    "line": 45,
    "stack_trace": "Traceback (most recent call last):\n  File..."
  }
]

Warning Details JSON Structure

[
  {
    "message": "DeprecationWarning: Function X is deprecated",
    "file": "tests/test_utils.py",
    "line": 23
  }
]

3. Logs Table

Purpose: Centralized application logs from all services with structured metadata.

Schema

Column Type Nullable Default Description
log_id INTEGER NOT NULL AUTO Primary key (auto-increment)
timestamp TIMESTAMP NOT NULL NOW() Log entry timestamp
level VARCHAR(20) NOT NULL - Log level: DEBUG, INFO, WARNING, ERROR, CRITICAL
service VARCHAR(100) NULL - Service name that generated the log
message TEXT NOT NULL - Log message
stack_trace TEXT NULL - Stack trace for errors
request_id VARCHAR(100) NULL - Request ID for tracing
user_id VARCHAR(100) NULL - User ID associated with the log
metadata JSONB NULL - Additional structured metadata
created_at TIMESTAMP NULL NOW() Record creation timestamp

Indexes

Log Levels

Metadata Example

{
  "method": "POST",
  "endpoint": "/api/test",
  "status_code": 500,
  "duration_ms": 1250,
  "ip_address": "192.168.1.100",
  "user_agent": "Mozilla/5.0..."
}

4. Traces Table

Purpose: Distributed tracing data for request flow analysis across services.

Schema

Column Type Nullable Default Description
trace_id VARCHAR(100) NOT NULL - Unique trace identifier
span_id VARCHAR(100) NOT NULL - Unique span identifier
parent_span_id VARCHAR(100) NULL - Parent span ID (for nested spans)
operation_name VARCHAR(200) NOT NULL - Operation/function name
service_name VARCHAR(100) NOT NULL - Service that performed the operation
start_time TIMESTAMP NOT NULL - Span start time
end_time TIMESTAMP NULL - Span end time
duration_ms INTEGER NULL - Duration in milliseconds
status VARCHAR(20) NULL - Status: success, error, timeout
tags JSONB NULL - Span tags (key-value pairs)
logs JSONB NULL - Span logs/events
created_at TIMESTAMP NULL NOW() Record creation timestamp

Indexes

Tags Example

{
  "http.method": "GET",
  "http.url": "/api/users/123",
  "http.status_code": 200,
  "db.statement": "SELECT * FROM users WHERE id = ?",
  "db.rows": 1,
  "error": false
}

Logs Example

[
  {
    "timestamp": "2025-12-07T10:30:45.123Z",
    "event": "cache_miss",
    "message": "User not found in cache, querying database"
  },
  {
    "timestamp": "2025-12-07T10:30:45.156Z",
    "event": "db_query_complete",
    "message": "Database query completed successfully"
  }
]

5. Coverage Table

Purpose: Test coverage metrics by component and test type.

Schema

Column Type Nullable Default Description
coverage_id INTEGER NOT NULL AUTO Primary key (auto-increment)
timestamp TIMESTAMP NOT NULL NOW() Coverage measurement timestamp
component VARCHAR(100) NOT NULL - Component/module name
coverage_percent DOUBLE PRECISION NOT NULL - Overall coverage percentage (0-100)
lines_covered INTEGER NULL - Number of lines covered
lines_total INTEGER NULL - Total number of lines
branches_covered INTEGER NULL - Number of branches covered
branches_total INTEGER NULL - Total number of branches
test_type VARCHAR(50) NULL - Test type that generated coverage
metadata JSONB NULL - Additional coverage metadata
created_at TIMESTAMP NULL NOW() Record creation timestamp

Indexes

Metadata Example

{
  "functions_covered": 45,
  "functions_total": 50,
  "statements_covered": 234,
  "statements_total": 267,
  "complexity_avg": 3.2,
  "file_path": "src/services/auth.py"
}

6. Alerts Table

Purpose: System alerts and notifications for monitoring and incident management.

Schema

Column Type Nullable Default Description
alert_id INTEGER NOT NULL AUTO Primary key (auto-increment)
timestamp TIMESTAMP NOT NULL NOW() Alert creation timestamp
severity VARCHAR(20) NOT NULL - Severity: info, warning, error, critical
title VARCHAR(200) NOT NULL - Alert title/summary
message TEXT NOT NULL - Detailed alert message
source VARCHAR(100) NULL - Alert source (service, monitor, etc.)
status VARCHAR(20) NULL ‘active’ Status: active, acknowledged, resolved
resolved_at TIMESTAMP NULL - Resolution timestamp
metadata JSONB NULL - Additional alert metadata
created_at TIMESTAMP NULL NOW() Record creation timestamp

Indexes

Severity Levels

Metadata Example

{
  "affected_services": ["api-gateway", "database"],
  "metric": "cpu_usage",
  "threshold": 90,
  "current_value": 95.3,
  "duration_seconds": 180,
  "runbook_url": "https://wiki/runbooks/high-cpu",
  "assigned_to": "oncall-team"
}

7. Relationships

Entity Relationship Diagram

┌─────────────┐
│   Agents    │
│ (43 total)  │
└─────────────┘
      ↓ (updates status)
      
┌──────────────────┐      ┌─────────────┐
│  Test Results    │─────→│  Coverage   │
│ (execution data) │      │ (metrics)   │
└──────────────────┘      └─────────────┘
      ↓ (generates)
      
┌─────────────┐      ┌─────────────┐      ┌─────────────┐
│    Logs     │←────→│   Traces    │─────→│   Alerts    │
│ (events)    │      │ (spans)     │      │ (incidents) │
└─────────────┘      └─────────────┘      └─────────────┘

Logical Relationships

  1. Test Results → Coverage: Test execution generates coverage data
  2. Test Results → Logs: Test failures create log entries
  3. Traces → Logs: Distributed traces reference log entries via request_id
  4. Any Table → Alerts: Anomalies in any table can trigger alerts
  5. Agents → Test Results: QA agents track test execution metrics

8. Sample Queries

Get Agent Status Summary

SELECT 
    category,
    COUNT(*) as total_agents,
    SUM(CASE WHEN status = 'healthy' THEN 1 ELSE 0 END) as healthy,
    SUM(CASE WHEN status = 'warning' THEN 1 ELSE 0 END) as warning,
    SUM(CASE WHEN status = 'error' THEN 1 ELSE 0 END) as error,
    ROUND(AVG(success_rate), 2) as avg_success_rate
FROM agents
GROUP BY category
ORDER BY category;

Get Test Pass Rate Trend (Last 30 Days)

SELECT 
    DATE(timestamp) as test_date,
    test_type,
    COUNT(*) as runs,
    ROUND(100.0 * SUM(passed) / NULLIF(SUM(total_tests), 0), 2) as pass_rate
FROM test_results
WHERE timestamp > NOW() - INTERVAL '30 days'
    AND status = 'completed'
GROUP BY DATE(timestamp), test_type
ORDER BY test_date DESC, test_type;

Get Most Common Test Failures

SELECT 
    jsonb_array_elements(failed_tests)->>'name' as test_name,
    COUNT(*) as failure_count,
    jsonb_array_elements(failed_tests)->>'file' as test_file
FROM test_results
WHERE status = 'completed' 
    AND failed > 0
    AND timestamp > NOW() - INTERVAL '7 days'
GROUP BY test_name, test_file
ORDER BY failure_count DESC
LIMIT 10;

Get Recent Error Logs with Context

SELECT 
    timestamp,
    service,
    message,
    request_id,
    metadata->>'endpoint' as endpoint,
    metadata->>'status_code' as status_code
FROM logs
WHERE level IN ('ERROR', 'CRITICAL')
    AND timestamp > NOW() - INTERVAL '1 hour'
ORDER BY timestamp DESC
LIMIT 50;

Get Slow Traces (> 1 second)

SELECT 
    trace_id,
    service_name,
    operation_name,
    duration_ms,
    tags->>'http.url' as url,
    tags->>'http.method' as method
FROM traces
WHERE duration_ms > 1000
    AND start_time > NOW() - INTERVAL '24 hours'
ORDER BY duration_ms DESC
LIMIT 20;

Get Coverage by Component

SELECT 
    component,
    ROUND(AVG(coverage_percent), 2) as avg_coverage,
    MAX(timestamp) as last_measured
FROM coverage
WHERE timestamp > NOW() - INTERVAL '7 days'
GROUP BY component
HAVING AVG(coverage_percent) < 80  -- Components below 80% coverage
ORDER BY avg_coverage ASC;

Get Active Critical Alerts

SELECT 
    alert_id,
    title,
    message,
    source,
    timestamp,
    EXTRACT(EPOCH FROM (NOW() - timestamp))/60 as age_minutes,
    metadata->>'affected_services' as affected_services
FROM alerts
WHERE status = 'active'
    AND severity = 'critical'
ORDER BY timestamp DESC;

Get Test Execution Analytics

WITH test_stats AS (
    SELECT 
        test_type,
        COUNT(*) as total_runs,
        ROUND(AVG(execution_time), 2) as avg_duration,
        ROUND(100.0 * SUM(passed) / NULLIF(SUM(total_tests), 0), 2) as pass_rate,
        SUM(failed) as total_failures
    FROM test_results
    WHERE timestamp > NOW() - INTERVAL '30 days'
        AND status = 'completed'
    GROUP BY test_type
)
SELECT * FROM test_stats
ORDER BY pass_rate ASC;

Data Retention Policy

Archival Query Example

-- Archive old test results to separate table
INSERT INTO test_results_archive
SELECT * FROM test_results
WHERE timestamp < NOW() - INTERVAL '90 days';

DELETE FROM test_results
WHERE timestamp < NOW() - INTERVAL '90 days';

Indexes and Performance

Current Index Coverage

Query Optimization Tips

  1. Always filter by timestamp ranges
  2. Use indexes for WHERE, JOIN, and ORDER BY
  3. Use JSONB operators efficiently: ->, ->>, @>, ?
  4. Consider partitioning for high-volume tables (logs, traces)
  5. Use EXPLAIN ANALYZE to validate query plans

Migration History

Migration File Contents
001 001_create_observability_tables.sql Core observability schema (agents, logs, traces, alerts, coverage, test_results)
002 002_populate_all_agents.sql Seed data — 58+ agents
003 003_create_document_tables.sql Document storage for RAG and knowledge base
004 004_create_memory_tables.sql Agent shared memory store
005 005_create_agent_learning_tables.sql Self-learning: lessons, applications, stats, refinement runs

Sandbox Session Schema

Sandbox session metadata is persisted to PostgreSQL for audit and recovery.

sandbox_sessions

Column Type Description
session_id UUID PK Unique session identifier
user_id VARCHAR(100) Authenticated user who created the session
project_id VARCHAR(255) Source generated project (nullable for blank sessions)
status VARCHAR(20) creating, ready, running, terminated, error
runtime_type VARCHAR(50) Primary runtime: python312, node20
containers JSONB Map of ContainerRole to ContainerInfo (id, ip, runtime, status)
preview_port INTEGER Port exposed for live preview
backend_port INTEGER Port the backend API listens on (nullable)
pipeline_stage VARCHAR(100) Last completed pipeline stage
pipeline_log TEXT Accumulated SSE log from launch pipeline
created_at TIMESTAMP Session creation time
last_active TIMESTAMP Last container activity
terminated_at TIMESTAMP Session termination time (nullable)

Agent Self-Learning Schema (Migration 005)

agent_lessons

Column Type Description
id SERIAL PK Auto-increment ID
agent_name VARCHAR(100) Agent that owns this lesson
lesson_text TEXT The lesson (injected into agent prompt)
severity VARCHAR(20) low, medium, high, critical
category VARCHAR(50) security, performance, code_quality, correctness
source_issue_type VARCHAR(100) Type of QA issue that generated this lesson
times_applied INTEGER How many times this lesson has been used
effectiveness_score FLOAT Auto-calculated (0.0-1.0) via trigger
is_active BOOLEAN Whether this lesson is currently injected
expires_at TIMESTAMP Optional expiry (nullable)
created_at TIMESTAMP When the lesson was learned
updated_at TIMESTAMP Last update

agent_lesson_applications

Column Type Description
id SERIAL PK  
lesson_id INTEGER FK References agent_lessons.id
agent_name VARCHAR(100) Agent that applied the lesson
pipeline_run_id VARCHAR(255) Pipeline run where lesson was applied
project_name VARCHAR(255) Project name
prevented_issue BOOLEAN Whether the lesson actually prevented the issue
outcome_notes TEXT Optional notes on outcome
outcome_verified_at TIMESTAMP When outcome was verified (nullable)

agent_learning_stats

Auto-updated via database triggers.

Column Type Description
agent_name VARCHAR(100) PK  
total_lessons_learned INTEGER  
active_lessons INTEGER  
total_applications INTEGER  
successful_preventions INTEGER  
overall_effectiveness FLOAT successful_preventions / total_applications
last_lesson_learned_at TIMESTAMP  
last_lesson_applied_at TIMESTAMP  

agent_refinement_runs

Column Type Description
id SERIAL PK  
triggered_by VARCHAR(100) rework, hitl, scheduled
pipeline_run_id VARCHAR(255)  
issues_analyzed INTEGER  
lessons_generated INTEGER  
status VARCHAR(20) running, completed, failed
started_at TIMESTAMP  
completed_at TIMESTAMP  
error_message TEXT Nullable

Contact & Support

For schema changes, run the appropriate migration file against the purple8 database. See the migrations/ directory for all SQL files.

Last Updated: February 2026 Schema Version: 2.0.0