137 changes: 131 additions & 6 deletions ARCHITECTURE.md
@@ -11,6 +11,99 @@ The demo provides a solid foundation with:
- ✅ Customer service layer
- ✅ Error handling and validation
- ✅ Unit tests
- ✅ **Distributed Supervisor with Horde-like functionality**
- ✅ **OTP Actor system with proper supervision**
- ✅ **Consistent hashing for customer distribution**
- ✅ **Cluster membership and node monitoring**

## Distributed System Features (Horde-like Implementation)

### 1. Consistent Hashing

The distributed supervisor uses consistent hashing to distribute customer actors across cluster nodes:

```gleam
// Hash ring distributes customers evenly
let hash_ring = build_hash_ring(config.ring_size, active_nodes)
let target_node = get_node_for_customer(customer_id, hash_ring)
```
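
As a rough illustration of the idea, the sketch below builds a ring of virtual nodes and looks up the owner of a customer. It is not the supervisor's actual code: `hash_string`, `virtual_nodes_per_node`, and the extra `ring_size` argument on `get_node_for_customer` are assumptions made to keep the example self-contained, and the toy hash would be replaced by a stronger one in practice.

```gleam
import gleam/dict.{type Dict}
import gleam/int
import gleam/list
import gleam/string

// Assumed number of virtual nodes placed on the ring per physical node.
const virtual_nodes_per_node = 16

// Toy polynomial string hash reduced modulo ring_size (illustration only).
fn hash_string(value: String, ring_size: Int) -> Int {
  value
  |> string.to_utf_codepoints
  |> list.fold(7, fn(acc, codepoint) {
    { acc * 31 + string.utf_codepoint_to_int(codepoint) } % ring_size
  })
}

// Maps ring positions to node names. If two virtual nodes collide on a
// position, the later insert wins.
pub fn build_hash_ring(ring_size: Int, nodes: List(String)) -> Dict(Int, String) {
  list.fold(nodes, dict.new(), fn(ring, node) {
    list.range(1, virtual_nodes_per_node)
    |> list.fold(ring, fn(acc, replica) {
      let label = node <> "#" <> int.to_string(replica)
      dict.insert(acc, hash_string(label, ring_size), node)
    })
  })
}

// Walks clockwise from the customer's hash to the first virtual node,
// wrapping around to the smallest position. Error(Nil) means an empty ring.
pub fn get_node_for_customer(
  customer_id: Int,
  ring_size: Int,
  hash_ring: Dict(Int, String),
) -> Result(String, Nil) {
  let key = hash_string("customer:" <> int.to_string(customer_id), ring_size)
  // Sorting on every lookup keeps the sketch short; a real ring would cache this.
  let positions = hash_ring |> dict.keys |> list.sort(int.compare)
  let owner_position = case list.find(positions, fn(p) { p >= key }) {
    Ok(position) -> Ok(position)
    Error(Nil) -> list.first(positions)
  }
  case owner_position {
    Ok(position) -> dict.get(hash_ring, position)
    Error(Nil) -> Error(Nil)
  }
}
```

With this construction, rebuilding the ring without one node only reassigns the customers that node owned; every other customer keeps its current owner.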

### 2. Node Discovery and Monitoring

```gleam
pub type DistributedSupervisorMessage {
  NodeJoined(String)
  NodeLeft(String)
  Rebalance
  GetClusterStatus(reply_with: Subject(ClusterStatus))
}
```

### 3. Automatic Failover

When a node leaves the cluster:
- Customer actors are automatically redistributed
- Hash ring is rebuilt to maintain consistency
- Orphaned customers are detected and reassigned
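
The failover steps above could be sketched as a pure function over the supervisor's bookkeeping. Everything here is hypothetical: `placements` (customer ID to node name), `orphaned_customers`, `handle_node_left`, and the injected `place` function are illustrative names, and `place` stands in for whatever lookup the rebuilt hash ring provides.

```gleam
import gleam/dict.{type Dict}
import gleam/list

// Customers whose recorded node is the one that just left.
pub fn orphaned_customers(
  placements: Dict(Int, String),
  departed: String,
) -> List(Int) {
  placements
  |> dict.filter(fn(_id, node) { node == departed })
  |> dict.keys
}

// Removes the departed node and reassigns only its customers.
// `place` is an assumed callback that picks a node from the survivors.
pub fn handle_node_left(
  active_nodes: List(String),
  placements: Dict(Int, String),
  departed: String,
  place: fn(Int, List(String)) -> Result(String, Nil),
) -> #(List(String), Dict(Int, String)) {
  let remaining = list.filter(active_nodes, fn(node) { node != departed })
  let updated =
    orphaned_customers(placements, departed)
    |> list.fold(placements, fn(acc, id) {
      case place(id, remaining) {
        Ok(node) -> dict.insert(acc, id, node)
        // No surviving node can take the customer, so drop the placement.
        Error(Nil) -> dict.delete(acc, id)
      }
    })
  #(remaining, updated)
}
```

Customers on surviving nodes are left untouched, which is the property consistent hashing is meant to preserve when the topology changes.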

### 4. Load Balancing

- Virtual nodes in hash ring ensure even distribution
- Customer placement based on consistent hashing algorithm
- Dynamic rebalancing when cluster topology changes

### 5. Cluster Status Monitoring

```gleam
pub type ClusterStatus {
  ClusterStatus(
    current_node: String,
    active_nodes: List(String),
    customer_distribution: Dict(String, Int)
  )
}
```

This provides real-time visibility into:
- Active cluster members
- Customer distribution across nodes
- System health and load balance
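
One way such a status could be derived, assuming the supervisor tracks actors per node as a nested dictionary (node name to customer ID to actor handle), is sketched below. The `cluster_status` and `load_spread` functions are hypothetical helpers, and `load_spread` is only one possible measure of balance.

```gleam
import gleam/dict.{type Dict}
import gleam/int
import gleam/list

pub type ClusterStatus {
  ClusterStatus(
    current_node: String,
    active_nodes: List(String),
    customer_distribution: Dict(String, Int),
  )
}

// Counts the actors held on each node. The actor handle type is left
// generic because only the number of entries matters here.
pub fn cluster_status(
  current_node: String,
  active_nodes: List(String),
  customer_actors: Dict(String, Dict(Int, actor)),
) -> ClusterStatus {
  let distribution =
    dict.map_values(customer_actors, fn(_node, actors) { dict.size(actors) })
  ClusterStatus(
    current_node: current_node,
    active_nodes: active_nodes,
    customer_distribution: distribution,
  )
}

// Difference between the busiest and the least busy node; 0 means the
// load is perfectly even (or there are no nodes).
pub fn load_spread(status: ClusterStatus) -> Int {
  let counts = dict.values(status.customer_distribution)
  let highest = list.fold(counts, 0, int.max)
  let lowest = list.fold(counts, highest, int.min)
  highest - lowest
}
```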

### 6. Graceful Node Shutdown with Actor Migration

The distributed supervisor supports graceful shutdown of nodes with automatic actor migration:

```gleam
// Gracefully shut down a node, migrating all of its actors to other nodes
pub fn graceful_shutdown(supervisor: Subject(DistributedSupervisorMessage), node_name: String) -> Result(Nil, String)

// Message types for graceful operations
pub type DistributedSupervisorMessage {
  GracefulShutdown(String, reply_with: Subject(Result(Nil, String)))
  MigrateActors(from_node: String, to_node: String, actor_ids: List(Int))
  // ... other messages
}

// Customer actors support state extraction and restoration for migration
pub type CustomerActorMessage {
  ExtractState(reply_with: Subject(Result(Option(Customer), String)))
  RestoreState(Option(Customer), reply_with: Subject(Result(Nil, String)))
  // ... other messages
}
```

**Graceful Shutdown Process:**
1. **State Extraction**: Extract state from all customer actors on the node that is shutting down
2. **Target Selection**: Identify healthy nodes to receive migrated actors
3. **Actor Migration**: Create new actors on target nodes with restored state
4. **Cleanup**: Stop the original actors on the node that is shutting down
5. **Topology Update**: Remove the node from the cluster and update hash ring
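
The planning half of this process (steps 1, 2 and 5) can be sketched without any actor machinery. The code below is an assumption-laden outline, not the supervisor's implementation: `Migration`, `plan_graceful_shutdown`, and the `choose_target` callback are hypothetical names, and `snapshots` stands for the state already extracted from each actor.

```gleam
import gleam/int
import gleam/list
import gleam/result

// One planned move: which customer goes where, carrying which state.
pub type Migration(state) {
  Migration(customer_id: Int, target_node: String, state: state)
}

// Returns the post-shutdown node list and the migrations to perform,
// or an error if any actor would be left without a home.
pub fn plan_graceful_shutdown(
  active_nodes: List(String),
  leaving: String,
  snapshots: List(#(Int, state)),
  choose_target: fn(Int, List(String)) -> Result(String, Nil),
) -> Result(#(List(String), List(Migration(state))), String) {
  // Target selection: every active node except the one shutting down.
  let targets = list.filter(active_nodes, fn(node) { node != leaving })
  case targets {
    [] -> Error("no healthy node left to receive actors from " <> leaving)
    _ ->
      snapshots
      |> list.try_map(fn(snapshot) {
        let #(customer_id, state) = snapshot
        choose_target(customer_id, targets)
        |> result.map(fn(target) { Migration(customer_id, target, state) })
        |> result.replace_error(
          "no target node for customer " <> int.to_string(customer_id),
        )
      })
      |> result.map(fn(migrations) { #(targets, migrations) })
  }
}
```

Computing the whole plan before stopping anything means a failure in target selection leaves the original actors running, which is one way to keep state from being lost mid-migration.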

**Benefits:**
- **Zero Downtime**: Customer actors remain available during node shutdown
- **State Preservation**: Customer data is maintained across migrations
- **Automatic Load Balancing**: Actors are redistributed according to consistent hashing
- **Fault Tolerance**: System continues operating with reduced capacity

## Full Production Architecture

@@ -40,8 +133,16 @@ pub type CustomerActor {
```gleam
// src/app_supervisor.gleam
import gleam/otp/supervisor
import distributed_supervisor

pub fn start_application() {
  // Start with distributed supervisor that provides Horde-like functionality
  let config = distributed_supervisor.default_config()
  distributed_supervisor.start(config)
}

// Traditional supervisor approach is also available
pub fn start_with_supervisor() {
  supervisor.start_spec(
    supervisor.Spec(
      argument: Nil,
@@ -50,12 +151,10 @@ pub fn start_application() {
      init: fn(_) {
        supervisor.Ready(
          children: [
            // Database connection pool
            database_supervisor_spec(),
            // Customer actor registry
            customer_registry_spec(),
            // Web server
            web_server_spec(),
            // Distributed supervisor as a child
            supervisor.worker(fn() {
              distributed_supervisor.start(distributed_supervisor.default_config())
            })
          ],
          restart: supervisor.OneForOne
        )
@@ -65,6 +164,32 @@ pub fn start_application() {
}
```

### 2.1. Distributed Supervisor (Horde-like Implementation)

```gleam
// src/distributed_supervisor.gleam
import gleam/otp/supervisor
import gleam/otp/actor
import gleam/dict

pub type DistributedSupervisorState {
  DistributedSupervisorState(
    config: DistributedConfig,
    active_nodes: List(String),
    hash_ring: Dict(Int, String),
    customer_actors: Dict(String, Dict(Int, Subject(CustomerActorMessage))),
    local_supervisor: Subject(supervisor.Message)
  )
}

// Features:
// - Consistent hashing for customer distribution
// - Automatic node discovery and monitoring
// - Customer actor migration on topology changes
// - Fault tolerance with automatic failover
// - Load balancing across cluster nodes
```

### 3. Database Layer with SQLite

```gleam
55 changes: 52 additions & 3 deletions README.md
Expand Up @@ -44,13 +44,33 @@ test/
- Error propagation and handling
- Functional API design

### ✅ **Distributed Supervisor with Horde-like Functionality**
- **Consistent Hashing**: Distributes customer actors across cluster nodes using hash rings
- **Node Discovery**: Automatic detection and monitoring of cluster members
- **Fault Tolerance**: Automatic failover and actor migration when nodes fail
- **Load Balancing**: Even distribution of customer actors across available nodes
- **Cluster Monitoring**: Real-time visibility into cluster status and actor distribution
- **Graceful Shutdown**: Zero-downtime node shutdown with automatic actor migration ✨

### ✅ **OTP Actor System**
- Proper OTP actors for customer management with `gleam_otp`
- Supervisor trees with fault tolerance and restart strategies
- Actor lifecycle management with proper supervision
- Message passing between distributed components
- State extraction and restoration for seamless actor migration ✨

### ✅ **Configuration Management**
- Environment-based configuration for distributed operation
- Support for both legacy and distributed modes
- Flexible cluster configuration (nodes, hash ring size, discovery intervals)

### 🔄 **Planned Enhancements (Full Production Version)**

To make this a complete production application, add:

1. **Real OTP Actors**:
1. **Real OTP Actors**: ✅ **COMPLETED**
```gleam
// Add dependency: gleam_otp = ">= 0.10.0 and < 1.0.0"
// Added dependency: gleam_otp = ">= 0.10.0 and < 1.0.0"
import gleam/otp/actor
import gleam/otp/supervisor
```
@@ -103,19 +123,48 @@ curl -X POST -H "Content-Type: application/json" \

## 🏃‍♂️ **Running the Demo**

### Legacy Mode (Default)
```bash
# Install Gleam (if not already installed)
curl -sSL https://github.com/gleam-lang/gleam/releases/download/v1.5.1/gleam-v1.5.1-x86_64-unknown-linux-musl.tar.gz -o gleam.tar.gz
tar -xzf gleam.tar.gz
sudo mv gleam /usr/local/bin/

# Run the demonstration
# Run the legacy demonstration
gleam run

# Run tests
gleam test
```

### Distributed Mode (New!)
```bash
# Run with distributed supervisor (Horde-like functionality)
export DISTRIBUTED_MODE=true
gleam run

# Configure cluster nodes (optional)
export CLUSTER_NODES="node1@localhost,node2@localhost,node3@localhost"
export HASH_RING_SIZE=512
export NODE_DISCOVERY_INTERVAL=3000
gleam run

# Run distributed tests
gleam test -- --module distributed_supervisor_test
```

### Configuration Options

Environment variables for distributed operation:
- `DISTRIBUTED_MODE`: Set to "true" or "1" to enable distributed mode
- `CLUSTER_NODES`: Comma-separated list of cluster node names
- `HASH_RING_SIZE`: Size of the consistent hash ring (default: 256)
- `NODE_DISCOVERY_INTERVAL`: Node discovery interval in milliseconds (default: 5000)
- `DATABASE_URL`: Database connection string (default: "./customers.db")
- `PORT`: Application port (default: 8080)
- `LOG_LEVEL`: Logging level (default: "info")
- `MAX_CONNECTIONS`: Maximum database connections (default: 100)
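
A minimal sketch of how the distributed-mode variables might be parsed with the defaults listed above. The `ClusterConfig` type and `parse_cluster_config` function are hypothetical, and `lookup` is an assumed stand-in for whatever function reads an environment variable and returns `Error(Nil)` when it is unset.

```gleam
import gleam/int
import gleam/list
import gleam/result
import gleam/string

pub type ClusterConfig {
  ClusterConfig(
    distributed: Bool,
    nodes: List(String),
    ring_size: Int,
    discovery_interval_ms: Int,
  )
}

// Falls back to the default when the variable is unset or not an integer.
fn int_or(value: Result(String, Nil), default: Int) -> Int {
  value
  |> result.try(int.parse)
  |> result.unwrap(default)
}

pub fn parse_cluster_config(
  lookup: fn(String) -> Result(String, Nil),
) -> ClusterConfig {
  // Only "true" and "1" enable distributed mode; anything else is legacy mode.
  let distributed = case lookup("DISTRIBUTED_MODE") {
    Ok("true") | Ok("1") -> True
    _ -> False
  }
  // Split the comma-separated list and discard empty entries.
  let nodes =
    lookup("CLUSTER_NODES")
    |> result.unwrap("")
    |> string.split(",")
    |> list.map(string.trim)
    |> list.filter(fn(name) { name != "" })
  ClusterConfig(
    distributed: distributed,
    nodes: nodes,
    ring_size: int_or(lookup("HASH_RING_SIZE"), 256),
    discovery_interval_ms: int_or(lookup("NODE_DISCOVERY_INTERVAL"), 5000),
  )
}
```

Passing the lookup function in, rather than reading the environment directly, keeps the parser easy to test with a fixed set of values.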

## 🎯 **Demo Output**

The demo application showcases:
2 changes: 1 addition & 1 deletion build/packages/gleam.lock
@@ -1 +1 @@
4844
3701
21 changes: 18 additions & 3 deletions demo.sh
@@ -62,16 +62,20 @@ echo ""
echo "🔄 Production Expansion:"
echo "To convert this demo to a full production application:"
echo ""
echo "1. Add OTP Dependencies:"
echo "1. Add OTP Dependencies: ✅ COMPLETED"
echo " gleam_otp = \">= 0.10.0 and < 1.0.0\""
echo " wisp = \">= 0.12.0 and < 1.0.0\""
echo " mist = \">= 1.2.0 and < 2.0.0\""
echo " sqlight = \">= 0.15.0 and < 1.0.0\""
echo ""
echo "2. Replace customer_actor.gleam with real OTP actors"
echo "2. Replace customer_actor.gleam with real OTP actors ✅ COMPLETED"
echo "3. Add REST API handlers with Wisp framework"
echo "4. Replace in-memory database with SQLite"
echo "5. Add supervisor tree for fault tolerance"
echo "5. Add supervisor tree for fault tolerance ✅ COMPLETED"
echo "6. Add distributed supervisor with Horde-like functionality ✅ COMPLETED"
echo ""
echo "🌐 NEW: Distributed Mode Available!"
echo "Run: DISTRIBUTED_MODE=true gleam run"
echo ""

echo "📡 API Endpoints (Designed):"
@@ -97,9 +101,20 @@ echo "• Actor-like service pattern (simplified)"
echo "• Database abstraction layer"
echo "• Comprehensive testing strategy"
echo "• Production-ready architecture design"
echo "• Distributed supervisor with Horde-like functionality ✅"
echo "• OTP actor system with proper supervision ✅"
echo "• Consistent hashing for load distribution ✅"
echo "• Cluster membership and fault tolerance ✅"
echo ""

echo "🎉 This provides a solid foundation for building distributed,"
echo "fault-tolerant customer management systems with Gleam!"
echo ""
echo "🌐 NEW FEATURES:"
echo "• Distributed supervisor inspired by Horde"
echo "• Consistent hashing for actor distribution"
echo "• Automatic node discovery and monitoring"
echo "• Fault-tolerant cluster operations"
echo "• Real-time cluster status monitoring"
echo ""
echo "See README.md and ARCHITECTURE.md for complete details."