Architecture

This document provides a detailed overview of Shugur Relay’s technical architecture, design patterns, and implementation details.

High-Level Architecture

Shugur Relay is designed as a distributed relay cluster that provides fault-tolerant, scalable Nostr infrastructure through a stateless, layered architecture. Users connect to a single gateway that represents the entire cluster, eliminating the need for client-side multi-relay management.

The architecture is organized into five layers:

  • Client Layer: Web browsers (Dashboard), Nostr clients (WebSocket), and monitoring tools (Prometheus)
  • Transport Layer: HTTP server (NIP-11, API), WebSocket endpoint (Nostr protocol), and metrics server (Prometheus)
  • Application Layer: Web handler, relay server, and node manager, covering event validation, connection management, subscription management, NIPs implementation, rate limiting & security, and worker pool management
  • Storage Layer: Database operations, event processor, bloom filter, and duplicate detection
  • Persistence Layer: CockroachDB, a distributed SQL database
Distributed Cluster Architecture

Cluster Overview

Shugur Relay operates as a distributed cluster where multiple relay nodes work together to provide high availability and data redundancy. This architecture ensures that:

  • Single Connection Point: Clients connect to any node in the cluster and gain access to the entire network’s data
  • Automatic Failover: If a node becomes unavailable, clients are seamlessly redirected to healthy nodes
  • Data Replication: Events are automatically replicated across cluster nodes using CockroachDB’s distributed consensus
  • Transparent Scaling: New nodes can be added to the cluster without client reconfiguration

The Shugur distributed cluster consists of:

  • Client Connections: Nostr Client 1, Nostr Client 2, and Nostr Client 3
  • Load Balancer Layer: A load balancer with health checks routes client traffic to the relay nodes
  • Relay Nodes: Node 1, Node 2, and Node 3, each exposing a gateway
  • Distributed Database: CockroachDB Node 1, Node 2, and Node 3, kept in sync through consensus replication

Failover Behavior

  1. Health Monitoring: Each node continuously monitors cluster health
  2. Automatic Detection: Failed nodes are detected within seconds
  3. Traffic Rerouting: Load balancer redirects traffic to healthy nodes
  4. Data Availability: All data remains accessible through surviving nodes
  5. Transparent Recovery: Failed nodes can rejoin without data loss

Benefits Over Traditional Multi-Relay Setup

Traditional Nostr                       Shugur Distributed Cluster
Client manages multiple connections     Single connection to cluster
Manual failover logic required          Automatic failover
Data scattered across relays            Unified data across cluster
Complex relay discovery                 Simple gateway discovery
Potential data inconsistency            Strong consistency via CockroachDB

Core Components

1. Application Node (internal/application/)

The central coordinator that manages all relay components.

Responsibilities:

  • Component lifecycle management
  • Dependency injection
  • Configuration distribution
  • Graceful shutdown coordination

Key Files:

  • node.go - Main node implementation
  • node_builder.go - Builder pattern for node construction
  • node_utils.go - Utility functions

2. Relay Server (internal/relay/)

Handles the core Nostr protocol implementation.

Responsibilities:

  • WebSocket connection management
  • Nostr message processing
  • Event validation and storage
  • Subscription management

Key Files:

  • server.go - HTTP/WebSocket server
  • connection.go - WebSocket connection handling
  • subscription.go - Subscription management
  • event_validator.go - Event validation logic
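
As a rough sketch of how incoming Nostr messages are routed to these components (the handler names below are illustrative placeholders, not the actual functions in connection.go):

// Illustrative message routing; handleEvent, handleReq, and handleClose
// are placeholders, not the actual functions in connection.go.
func handleMessage(raw []byte) error {
    var msg []json.RawMessage
    if err := json.Unmarshal(raw, &msg); err != nil {
        return fmt.Errorf("malformed message: %w", err)
    }
    if len(msg) == 0 {
        return fmt.Errorf("empty message")
    }

    var msgType string
    if err := json.Unmarshal(msg[0], &msgType); err != nil {
        return err
    }

    switch msgType {
    case "EVENT": // publish an event
        return handleEvent(msg[1:])
    case "REQ": // open a subscription
        return handleReq(msg[1:])
    case "CLOSE": // close a subscription
        return handleClose(msg[1:])
    default:
        return fmt.Errorf("unknown message type %q", msgType)
    }
}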

3. Storage Layer (internal/storage/)

Manages data persistence and retrieval.

Responsibilities:

  • Database connection management
  • Event storage and retrieval
  • Query optimization
  • Bloom filter management

Key Files:

  • db.go - Database connection and operations
  • event_processor.go - Event processing pipeline
  • queries.go - Database queries
  • schema.go - Database schema management
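
A simplified sketch of what an event insert might look like with pgx, following the schema shown later in this document (the function and the DB.Pool field are assumptions for illustration, not the actual code in db.go):

// Illustrative event insert; the real implementation in db.go may differ.
func (db *DB) SaveEvent(ctx context.Context, ev *nostr.Event) error {
    tags, err := json.Marshal(ev.Tags)
    if err != nil {
        return err
    }
    _, err = db.Pool.Exec(ctx,
        `INSERT INTO events (id, pubkey, created_at, kind, tags, content, sig)
         VALUES ($1, $2, $3, $4, $5, $6, $7)
         ON CONFLICT (id) DO NOTHING`,
        ev.ID, ev.PubKey, int64(ev.CreatedAt), ev.Kind, tags, ev.Content, ev.Sig)
    return err
}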

4. NIPs Implementation (internal/relay/nips/)

Implements Nostr Improvement Proposals.

Responsibilities:

  • Protocol compliance validation
  • Event type specific processing
  • Feature-specific logic

Key Files:

  • nip01.go - Basic protocol flow
  • nip11.go - Relay information document
  • nip17.go - Private direct messages
  • And many more NIP implementations…

Design Patterns

1. Builder Pattern

Used for complex object construction, particularly the Node:

type NodeBuilder struct {
    ctx     context.Context
    cfg     *config.Config
    privKey ed25519.PrivateKey
    // ... other fields
}

func (b *NodeBuilder) BuildDB() error { /* ... */ }
func (b *NodeBuilder) BuildWorkers()  { /* ... */ }
func (b *NodeBuilder) Build() *Node   { /* ... */ }

2. Interface Segregation

Components depend on interfaces rather than concrete implementations:

type NodeInterface interface {
    RegisterConn(conn domain.WebSocketConnection)
    UnregisterConn(conn domain.WebSocketConnection)
    GetActiveConnectionCount() int64
    // ... other methods
}

3. Dependency Injection

Configuration and dependencies are injected through constructors:

func NewServer(relayCfg config.RelayConfig, node domain.NodeInterface, fullCfg *config.Config) *Server

4. Worker Pool Pattern

Background task processing using worker pools:

type WorkerPool struct {
    workers  chan chan Job
    jobQueue chan Job
    quit     chan bool
}
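
A minimal sketch of how such a pool might dispatch work. The Job type and the method names are assumptions for illustration; only the struct above appears in the codebase:

// Job is assumed here to be a unit of work the pool executes.
type Job func()

// Illustrative worker loop: each worker registers its own job channel
// and waits for the dispatcher to hand it work.
func (wp *WorkerPool) startWorker() {
    jobs := make(chan Job)
    go func() {
        for {
            wp.workers <- jobs // announce availability
            select {
            case job := <-jobs:
                job()
            case <-wp.quit:
                return
            }
        }
    }()
}

// Illustrative dispatcher: pulls jobs off the queue and forwards
// each one to the next available worker.
func (wp *WorkerPool) dispatch() {
    for job := range wp.jobQueue {
        worker := <-wp.workers
        worker <- job
    }
}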

Data Flow

Event Processing Pipeline

The pipeline runs between the client (WebSocket), the relay server, and the database (CockroachDB):

  1. Client sends ["EVENT", event] to the relay server
  2. Relay server parses the JSON
  3. Relay server validates the signature
  4. Relay server checks NIP-specific rules
  5. Relay server applies rate limiting
  6. Relay server runs policy checks
  7. Relay server stores the event in the database and receives success or error
  8. Relay server notifies matching subscribers
  9. Relay server responds to the client with ["OK", id, bool, message]

Subscription Flow

The subscription flow runs between the same three parties:

  1. Client sends ["REQ", id, filters...] to the relay server
  2. Relay server validates the filters
  3. Relay server checks subscription limits
  4. Relay server stores the subscription
  5. Relay server queries the database for matching events
  6. For each matching event, the relay server sends ["EVENT", id, event] to the client
  7. Relay server sends ["EOSE", id] once all stored events have been delivered
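
In code form, the REQ handling above might look roughly like this. The helper and write methods (validateFilters, addSubscription, QueryEvents, writeJSON) are assumptions for illustration:

// Illustrative REQ handling: query stored events, stream them to the
// client, then signal end-of-stored-events. Helper names are assumptions.
func (c *WsConnection) handleReq(ctx context.Context, subID string, filters nostr.Filters) error {
    if err := validateFilters(filters); err != nil {
        return c.writeJSON([]interface{}{"CLOSED", subID, "invalid: " + err.Error()})
    }
    c.addSubscription(subID, filters)

    events, err := c.db.QueryEvents(ctx, filters)
    if err != nil {
        return err
    }
    for _, ev := range events {
        if err := c.writeJSON([]interface{}{"EVENT", subID, ev}); err != nil {
            return err
        }
    }
    // EOSE tells the client all stored events have been delivered;
    // matching events that arrive later are pushed in real time.
    return c.writeJSON([]interface{}{"EOSE", subID})
}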

Concurrency Model

Goroutine Usage

  1. Main Server Goroutine - HTTP server listener
  2. Connection Goroutines - One per WebSocket connection
  3. Worker Pool Goroutines - Background task processing
  4. Metrics Goroutines - Metrics collection and export
  5. Database Pool Goroutines - Connection pool management

Synchronization Mechanisms

1. Mutexes

Used for protecting shared state:

type WsConnection struct {
    subscriptionsMu sync.RWMutex
    subscriptions   map[string]SubscriptionInfo
    // ...
}
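
For example, adding and reading subscriptions might be guarded like this (the method names are illustrative):

// Illustrative accessors showing how the RWMutex guards the subscription map.
func (c *WsConnection) addSubscription(id string, info SubscriptionInfo) {
    c.subscriptionsMu.Lock()
    defer c.subscriptionsMu.Unlock()
    c.subscriptions[id] = info
}

func (c *WsConnection) getSubscription(id string) (SubscriptionInfo, bool) {
    c.subscriptionsMu.RLock()
    defer c.subscriptionsMu.RUnlock()
    info, ok := c.subscriptions[id]
    return info, ok
}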

2. Channels

Used for communication between goroutines:

type WorkerPool struct {
    jobQueue chan Job
    workers  chan chan Job
    quit     chan bool
}

3. Atomic Operations

Used for counters and flags:

type WsConnection struct {
    isClosed atomic.Bool
    // ...
}
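
A typical use of such a flag is making close idempotent (a sketch; the actual Close logic may differ):

// Illustrative idempotent close using the atomic flag.
func (c *WsConnection) Close() {
    // CompareAndSwap returns true only for the first caller,
    // so cleanup runs exactly once even under concurrent calls.
    if !c.isClosed.CompareAndSwap(false, true) {
        return
    }
    // ... release resources, cancel subscriptions, close the socket
}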

Storage Architecture

Database Schema

-- Events table (main storage)
CREATE TABLE events (
    id         TEXT PRIMARY KEY,
    pubkey     TEXT NOT NULL,
    created_at BIGINT NOT NULL,
    kind       INTEGER NOT NULL,
    tags       JSONB,
    content    TEXT,
    sig        TEXT NOT NULL,
    INDEX idx_events_pubkey (pubkey),
    INDEX idx_events_kind (kind),
    INDEX idx_events_created_at (created_at),
    INDEX idx_events_tags (tags)
);

-- Additional indexes for performance
CREATE INDEX idx_events_composite ON events (kind, created_at);
CREATE INDEX idx_events_search ON events USING gin(to_tsvector('english', content));

Query Optimization

1. Bloom Filters

Used for quick duplicate detection:

type DB struct {
    Bloom *bloom.BloomFilter
    // ...
}

// Check whether the event might already exist. A negative bloom filter
// result is definitive: the event is new, so the database lookup is skipped.
if !db.Bloom.TestString(eventID) {
    // Definitely a new event, not a duplicate
    return false
}
// Possible match: fall through to an exact database check.

2. Connection Pooling

Optimized database connection management:

pool, err := pgxpool.New(ctx, dbURI)
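
Pool sizing can be tuned through pgxpool's config before the pool is created. A hedged sketch; the specific limits below are examples, not Shugur Relay's actual defaults:

// Illustrative pool configuration; the limits shown are examples only.
cfg, err := pgxpool.ParseConfig(dbURI)
if err != nil {
    return nil, err
}
cfg.MaxConns = 50
cfg.MinConns = 5
cfg.MaxConnIdleTime = 5 * time.Minute

pool, err := pgxpool.NewWithConfig(ctx, cfg)
if err != nil {
    return nil, err
}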

3. Prepared Statements

Cached query plans for better performance.

Security Architecture

Input Validation

1. Event Validation Pipeline

func (ev *EventValidator) ValidateEvent(event *nostr.Event) error {
    // 1. Structural validation
    if err := ev.validateStructure(event); err != nil {
        return err
    }
    // 2. Signature verification
    if err := ev.validateSignature(event); err != nil {
        return err
    }
    // 3. NIP-specific validation
    if err := ev.validateWithNIPs(event); err != nil {
        return err
    }
    return nil
}

2. Rate Limiting

Token bucket algorithm implementation:

type RateLimiter struct {
    limiters sync.Map // map[string]*rate.Limiter
}

func (rl *RateLimiter) Allow(key string) bool {
    limiter := rl.getLimiter(key)
    return limiter.Allow()
}
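
The getLimiter helper is not shown above; a plausible sketch using golang.org/x/time/rate (the specific rate and burst values are examples, not the configured limits):

// Illustrative per-key limiter lookup; rate and burst values are examples.
func (rl *RateLimiter) getLimiter(key string) *rate.Limiter {
    if l, ok := rl.limiters.Load(key); ok {
        return l.(*rate.Limiter)
    }
    // 10 events/second with a burst of 20; first writer wins on races.
    l, _ := rl.limiters.LoadOrStore(key, rate.NewLimiter(rate.Limit(10), 20))
    return l.(*rate.Limiter)
}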

Access Control

1. Blacklist/Whitelist System

type Node struct {
    blacklistPubKeys map[string]struct{}
    whitelistPubKeys map[string]struct{}
}
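
A check against these sets might look like the following sketch; the real policy logic may differ:

// Illustrative access check: the blacklist always wins; if a whitelist is
// configured, only listed pubkeys are accepted.
func (n *Node) pubkeyAllowed(pubkey string) bool {
    if _, banned := n.blacklistPubKeys[pubkey]; banned {
        return false
    }
    if len(n.whitelistPubKeys) > 0 {
        _, ok := n.whitelistPubKeys[pubkey]
        return ok
    }
    return true
}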

2. Progressive Banning

Escalating penalties for policy violations.

Performance Optimizations

Memory Management

1. Object Pooling

Reuse of frequently allocated objects:

var eventPool = sync.Pool{
    New: func() interface{} {
        return &nostr.Event{}
    },
}
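
Usage follows the standard Get/Put cycle, with the event zeroed before it returns to the pool (a sketch):

// Illustrative pool usage: reset the event before returning it so stale
// fields never leak into the next request.
ev := eventPool.Get().(*nostr.Event)
defer func() {
    *ev = nostr.Event{}
    eventPool.Put(ev)
}()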

2. Bloom Filter Usage

Reduces database queries for duplicate detection.

3. Connection Caching

WebSocket connections are kept alive and reused.

CPU Optimization

1. Goroutine Pool

Limited number of worker goroutines to prevent resource exhaustion.

2. Efficient JSON Processing

Optimized JSON parsing and generation.

3. Signature Verification Caching

Cache results of expensive cryptographic operations.

Deployment Models

Standalone Deployment

Clients connect directly to a single Shugur Relay node backed by a single CockroachDB instance.

  • Pros: Simple to set up and manage
  • Cons: Not highly available; single point of failure

Distributed Deployment

Clients connect through a load balancer to Relay Node 1, Relay Node 2, and Relay Node 3, which share a CockroachDB cluster made up of CockroachDB Node 1, Node 2, and Node 3.

  • Pros: Highly available and horizontally scalable; tolerant to node failures
  • Cons: More complex to set up and manage

Monitoring and Observability

Metrics Collection

1. Prometheus Metrics

var (
    EventsReceived = prometheus.NewCounterVec(
        prometheus.CounterOpts{
            Name: "relay_events_received_total",
            Help: "Total number of events received",
        },
        []string{"result"},
    )
)

2. Structured Logging

logger.Info("Event processed",
    zap.String("event_id", event.ID),
    zap.String("pubkey", event.PubKey),
    zap.Int("kind", event.Kind),
)

Health Checks

1. Database Health

Regular database connectivity checks.
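
A database health probe can be as simple as a ping with a short timeout. A sketch, reusing the assumed DB.Pool field from earlier; the actual check may live elsewhere:

// Illustrative database health check with a bounded timeout.
func (db *DB) Healthy(ctx context.Context) bool {
    ctx, cancel := context.WithTimeout(ctx, 2*time.Second)
    defer cancel()
    return db.Pool.Ping(ctx) == nil
}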

2. Memory Usage Monitoring

Track memory usage and garbage collection.

3. Connection Metrics

Monitor active connections and connection rates.

Error Handling

Error Classification

1. Validation Errors

Errors in event format or content:

type ValidationError struct {
    Field   string
    Message string
}

2. System Errors

Infrastructure or runtime errors:

type SystemError struct {
    Component string
    Cause     error
}

Error Recovery

1. Circuit Breaker Pattern

Automatic fallback when dependencies fail.

2. Graceful Degradation

Continue operating with reduced functionality.

3. Retry Logic

Exponential backoff for transient failures.
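
A minimal backoff loop is sketched below; the actual retry policy, attempt count, and delays are configuration-dependent:

// Illustrative retry with exponential backoff and context cancellation.
func retry(ctx context.Context, attempts int, base time.Duration, fn func() error) error {
    var err error
    for i := 0; i < attempts; i++ {
        if err = fn(); err == nil {
            return nil
        }
        delay := base * time.Duration(1<<i) // 1x, 2x, 4x, ...
        select {
        case <-time.After(delay):
        case <-ctx.Done():
            return ctx.Err()
        }
    }
    return err
}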

Scalability Considerations

Horizontal Scaling

1. Stateless Design

The relay can be horizontally scaled by running multiple instances.

2. Database Clustering

CockroachDB provides built-in clustering capabilities.

3. Load Balancing

WebSocket connections can be load balanced across instances.

Vertical Scaling

1. Resource Optimization

Efficient use of CPU, memory, and network resources.

2. Connection Limits

Configurable limits based on available resources.

3. Database Tuning

Optimized database configuration for the workload.

Extension Points

Plugin System (Future)

The architecture supports future plugin development:

type Plugin interface {
    Init(ctx context.Context, cfg PluginConfig) error
    ProcessEvent(event *nostr.Event) error
    Shutdown() error
}

Custom NIPs

New NIP implementations can be added easily:

func RegisterNIP(kind int, validator func(*nostr.Event) error) {
    nipValidators[kind] = validator
}

Custom Storage Backends

Storage layer can be extended with different backends.
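
One way to make that concrete would be an interface along these lines; this is a hypothetical shape, not an existing Shugur API:

// Hypothetical storage interface; Shugur's current storage layer is the
// CockroachDB implementation described above.
type EventStore interface {
    SaveEvent(ctx context.Context, ev *nostr.Event) error
    QueryEvents(ctx context.Context, filters nostr.Filters) ([]*nostr.Event, error)
    DeleteEvent(ctx context.Context, id string) error
    Close() error
}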