richardcase · Copilot · Sep 22, 2025
diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md
@@ -11,6 +11,99 @@ The demo provides a solid foundation with:
 - ✅ Customer service layer
 - ✅ Error handling and validation
 - ✅ Unit tests
+- ✅ **Distributed Supervisor with Horde-like functionality**
+- ✅ **OTP Actor system with proper supervision**
+- ✅ **Consistent hashing for customer distribution**
+- ✅ **Cluster membership and node monitoring**
+
+## Distributed System Features (Horde-like Implementation)
+
+### 1. Consistent Hashing
+
+The distributed supervisor uses consistent hashing to distribute customer actors across cluster nodes:
+
+```gleam
+// Hash ring distributes customers evenly
+let hash_ring = build_hash_ring(config.ring_size, active_nodes)
+let target_node = get_node_for_customer(customer_id, hash_ring)
+```
+
+### 2. Node Discovery and Monitoring
+
+```gleam
+pub type DistributedSupervisorMessage {
+  NodeJoined(String)
+  NodeLeft(String)
+  Rebalance
+  GetClusterStatus(reply_with: Subject(ClusterStatus))
+}
+```
+
+### 3. Automatic Failover
+
+When a node leaves the cluster:
+- Customer actors on that node are automatically redistributed
+- The hash ring is rebuilt to maintain consistency
+- Orphaned customers are detected and reassigned
+
+### 4. Load Balancing
+
+- Virtual nodes in the hash ring spread customers more evenly across nodes
+- Customer placement is determined by the consistent hashing algorithm
+- Dynamic rebalancing runs when the cluster topology changes
+
+### 5. Cluster Status Monitoring
+
+```gleam
+pub type ClusterStatus {
+  ClusterStatus(
+    current_node: String,
+    active_nodes: List(String),
+    customer_distribution: Dict(String, Int)
+  )
+}
+```
+
+This provides real-time visibility into:
+- Active cluster members
+- Customer distribution across nodes
+- System health and load balance
+
+### 6. Graceful Node Shutdown with Actor Migration
+
+The distributed supervisor supports graceful shutdown of nodes with automatic actor migration:
+
+```gleam
+// Gracefully shutdown a node, migrating all actors to other nodes
+pub fn graceful_shutdown(supervisor: Subject(DistributedSupervisorMessage), node_name: String) -> Result(Nil, String)
+
+// Message types for graceful operations
+pub type DistributedSupervisorMessage {
+  GracefulShutdown(String, reply_with: Subject(Result(Nil, String)))
+  MigrateActors(from_node: String, to_node: String, actor_ids: List(Int))
+  // ... other messages
+}
+
+// Customer actors support state extraction and restoration for migration
+pub type CustomerActorMessage {
+  ExtractState(reply_with: Subject(Result(Option(Customer), String)))
+  RestoreState(Option(Customer), reply_with: Subject(Result(Nil, String)))
+  // ... other messages
+}
+```
+
+**Graceful Shutdown Process:**
+1. **State Extraction**: Extract state from all customer actors on the node that is shutting down
+2. **Target Selection**: Identify healthy nodes to receive the migrated actors
+3. **Actor Migration**: Create new actors on the target nodes with the restored state
+4. **Cleanup**: Stop the actors on the node that is shutting down
+5. **Topology Update**: Remove the node from the cluster and update the hash ring
+
+**Benefits:**
+- **Zero Downtime**: Customer actors remain available during node shutdown
+- **State Preservation**: Customer data is maintained across migrations
+- **Automatic Load Balancing**: Actors are redistributed according to consistent hashing
+- **Fault Tolerance**: System continues operating with reduced capacity
 
 ## Full Production Architecture
 
@@ -40,8 +133,16 @@ pub type CustomerActor {
 ```gleam
 // src/app_supervisor.gleam
 import gleam/otp/supervisor
+import distributed_supervisor
 
 pub fn start_application() {
+  // Start with distributed supervisor that provides Horde-like functionality
+  let config = distributed_supervisor.default_config()
+  distributed_supervisor.start(config)
+}
+
+// Traditional supervisor approach is also available
+pub fn start_with_supervisor() {
   supervisor.start_spec(
     supervisor.Spec(
       argument: Nil,
@@ -50,12 +151,10 @@ pub fn start_application() {
       init: fn(_) {
         supervisor.Ready(
           children: [
-            // Database connection pool
-            database_supervisor_spec(),
-            // Customer actor registry
-            customer_registry_spec(),
-            // Web server
-            web_server_spec(),
+            // Distributed supervisor as a child
+            supervisor.worker(fn(_) {
+              distributed_supervisor.start(distributed_supervisor.default_config())
+            })
           ],
           restart: supervisor.OneForOne
         )
@@ -65,6 +164,32 @@ pub fn start_application() {
 }
 ```
 
+### 2.1. Distributed Supervisor (Horde-like Implementation)
+
+```gleam
+// src/distributed_supervisor.gleam
+import gleam/otp/supervisor
+import gleam/otp/actor
+import gleam/dict.{type Dict}
+
+pub type DistributedSupervisorState {
+  DistributedSupervisorState(
+    config: DistributedConfig,
+    active_nodes: List(String),
+    hash_ring: Dict(Int, String),
+    customer_actors: Dict(String, Dict(Int, Subject(CustomerActorMessage))),
+    local_supervisor: Subject(supervisor.Message)
+  )
+}
+
+// Features:
+// - Consistent hashing for customer distribution
+// - Automatic node discovery and monitoring
+// - Customer actor migration on topology changes
+// - Fault tolerance with automatic failover
+// - Load balancing across cluster nodes
+```
+
 ### 3. Database Layer with SQLite
 
 ```gleam

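Note on the consistent hashing section added to ARCHITECTURE.md: the diff calls `build_hash_ring` and `get_node_for_customer` but does not show their bodies. The sketch below is one way such a ring with virtual nodes could work. It is not the PR's implementation: `RingEntry`, `toy_hash`, `virtual_nodes`, `build_ring`, and `node_for_customer` are hypothetical names, the ring is assumed to be a sorted list rather than the `Dict(Int, String)` in the state type, and the hash is a toy chosen for readability.

```gleam
import gleam/int
import gleam/list
import gleam/string

// Hypothetical ring entry: a position on the ring and the node that owns it.
pub type RingEntry {
  RingEntry(position: Int, node: String)
}

// Toy string hash (illustration only): maps a key onto 0..ring_size - 1.
fn toy_hash(key: String, ring_size: Int) -> Int {
  key
  |> string.to_utf_codepoints
  |> list.fold(0, fn(acc: Int, codepoint: UtfCodepoint) {
    { acc * 31 + string.utf_codepoint_to_int(codepoint) } % ring_size
  })
}

// One entry per virtual node, so each physical node owns several positions.
fn virtual_nodes(node: String, remaining: Int, ring_size: Int) -> List(RingEntry) {
  case remaining <= 0 {
    True -> []
    False -> {
      let label = node <> "#" <> int.to_string(remaining)
      [
        RingEntry(position: toy_hash(label, ring_size), node: node),
        ..virtual_nodes(node, remaining - 1, ring_size)
      ]
    }
  }
}

// Build the ring sorted by position so a lookup can walk it in order.
pub fn build_ring(
  nodes: List(String),
  vnodes_per_node: Int,
  ring_size: Int,
) -> List(RingEntry) {
  nodes
  |> list.flat_map(fn(node: String) {
    virtual_nodes(node, vnodes_per_node, ring_size)
  })
  |> list.sort(fn(a: RingEntry, b: RingEntry) {
    int.compare(a.position, b.position)
  })
}

// The owner is the first entry at or after the key's position; if there is
// none, wrap around to the first entry. An empty ring yields Error(Nil).
pub fn node_for_customer(
  ring: List(RingEntry),
  customer_id: Int,
  ring_size: Int,
) -> Result(String, Nil) {
  let position = toy_hash(int.to_string(customer_id), ring_size)
  let owner = case
    list.find(ring, fn(entry: RingEntry) { entry.position >= position })
  {
    Ok(entry) -> Ok(entry)
    Error(Nil) -> list.first(ring)
  }
  case owner {
    Ok(entry) -> Ok(entry.node)
    Error(Nil) -> Error(Nil)
  }
}
```

With a ring like this, removing a node deletes only that node's entries, so only customers that mapped to those positions need a new owner. That is the property the "Automatic Failover" and "Load Balancing" sections rely on.
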
diff --git a/README.md b/README.md
@@ -44,13 +44,33 @@ test/
 - Error propagation and handling
 - Functional API design
 
+### ✅ **Distributed Supervisor with Horde-like Functionality**
+- **Consistent Hashing**: Distributes customer actors across cluster nodes using hash rings
+- **Node Discovery**: Automatic detection and monitoring of cluster members
+- **Fault Tolerance**: Automatic failover and actor migration when nodes fail
+- **Load Balancing**: Even distribution of customer actors across available nodes
+- **Cluster Monitoring**: Real-time visibility into cluster status and actor distribution
+- **Graceful Shutdown**: Zero-downtime node shutdown with automatic actor migration ✨
+
+### ✅ **OTP Actor System**
+- Proper OTP actors for customer management with `gleam_otp`
+- Supervisor trees with fault tolerance and restart strategies
+- Actor lifecycle management with proper supervision
+- Message passing between distributed components
+- State extraction and restoration for seamless actor migration ✨
+
+### ✅ **Configuration Management**
+- Environment-based configuration for distributed operation
+- Support for both legacy and distributed modes
+- Flexible cluster configuration (nodes, hash ring size, discovery intervals)
+
 ### 🔄 **Planned Enhancements (Full Production Version)**
 
 To make this a complete production application, add:
 
-1. **Real OTP Actors**:
+1. **Real OTP Actors**: ✅ **COMPLETED**
    ```gleam
-   // Add dependency: gleam_otp = ">= 0.10.0 and < 1.0.0"
+   // Added dependency: gleam_otp = ">= 0.10.0 and < 1.0.0"
    import gleam/otp/actor
    import gleam/otp/supervisor
    ```
@@ -103,19 +123,48 @@ curl -X POST -H "Content-Type: application/json" \
 
 ## 🏃‍♂️ **Running the Demo**
 
+### Legacy Mode (Default)
 ```bash
 # Install Gleam (if not already installed)
 curl -sSL https://github.com/gleam-lang/gleam/releases/download/v1.5.1/gleam-v1.5.1-x86_64-unknown-linux-musl.tar.gz -o gleam.tar.gz
 tar -xzf gleam.tar.gz
 sudo mv gleam /usr/local/bin/
 
-# Run the demonstration
+# Run the legacy demonstration
 gleam run
 
 # Run tests
 gleam test
 ```
 
+### Distributed Mode (New!)
+```bash
+# Run with distributed supervisor (Horde-like functionality)
+export DISTRIBUTED_MODE=true
+gleam run
+
+# Configure cluster nodes (optional)
+export CLUSTER_NODES="node1@localhost,node2@localhost,node3@localhost"
+export HASH_RING_SIZE=512
+export NODE_DISCOVERY_INTERVAL=3000
+gleam run
+
+# Run distributed tests
+gleam test -- --module distributed_supervisor_test
+```
+
+### Configuration Options
+
+Environment variables for distributed operation:
+- `DISTRIBUTED_MODE`: Set to "true" or "1" to enable distributed mode
+- `CLUSTER_NODES`: Comma-separated list of cluster node names
+- `HASH_RING_SIZE`: Size of the consistent hash ring (default: 256)
+- `NODE_DISCOVERY_INTERVAL`: Node discovery interval in milliseconds (default: 5000)
+- `DATABASE_URL`: Database connection string (default: "./customers.db")
+- `PORT`: Application port (default: 8080)
+- `LOG_LEVEL`: Logging level (default: "info")
+- `MAX_CONNECTIONS`: Maximum database connections (default: 100)
+
 ## 🎯 **Demo Output**
 
 The demo application showcases:

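Note on the README's "Configuration Options": the diff documents the variables and their defaults but not how they are parsed. As a rough sketch of the documented behaviour, the function below turns an already-loaded environment into typed settings. It assumes the environment has been read into a `Dict(String, String)` by some other means; `ClusterSettings`, `settings_from_env`, and `int_or` are hypothetical names and do not come from the PR. Only the variable names, the "true"/"1" rule, and the defaults (256 and 5000) are taken from the README.

```gleam
import gleam/dict.{type Dict}
import gleam/int
import gleam/list
import gleam/result
import gleam/string

// Hypothetical settings record covering the distributed-mode variables.
pub type ClusterSettings {
  ClusterSettings(
    distributed: Bool,
    nodes: List(String),
    hash_ring_size: Int,
    discovery_interval_ms: Int,
  )
}

// Read an integer variable, falling back to the default when the variable
// is missing or is not a valid integer.
fn int_or(env: Dict(String, String), name: String, default: Int) -> Int {
  dict.get(env, name)
  |> result.try(int.parse)
  |> result.unwrap(default)
}

pub fn settings_from_env(env: Dict(String, String)) -> ClusterSettings {
  // The README enables distributed mode for "true" or "1" only.
  let distributed = case dict.get(env, "DISTRIBUTED_MODE") {
    Ok("true") | Ok("1") -> True
    _ -> False
  }
  // CLUSTER_NODES is comma-separated; blank entries are dropped.
  let nodes =
    dict.get(env, "CLUSTER_NODES")
    |> result.unwrap("")
    |> string.split(on: ",")
    |> list.map(string.trim)
    |> list.filter(fn(name: String) { name != "" })
  ClusterSettings(
    distributed: distributed,
    nodes: nodes,
    hash_ring_size: int_or(env, "HASH_RING_SIZE", 256),
    discovery_interval_ms: int_or(env, "NODE_DISCOVERY_INTERVAL", 5000),
  )
}
```

Taking the environment as a plain dictionary keeps the parsing logic testable without touching the real process environment. Whether the PR handles invalid values by falling back to defaults, as this sketch does, is not stated in the diff.
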
diff --git a/build/packages/gleam.lock b/build/packages/gleam.lock
@@ -1 +1 @@
-4844
+3701
diff --git a/demo.sh b/demo.sh
@@ -62,16 +62,20 @@ echo ""
 echo "🔄 Production Expansion:"
 echo "To convert this demo to a full production application:"
 echo ""
-echo "1. Add OTP Dependencies:"
+echo "1. Add OTP Dependencies: ✅ COMPLETED"
 echo "   gleam_otp = \">= 0.10.0 and < 1.0.0\""
 echo "   wisp = \">= 0.12.0 and < 1.0.0\""
 echo "   mist = \">= 1.2.0 and < 2.0.0\""
 echo "   sqlight = \">= 0.15.0 and < 1.0.0\""
 echo ""
-echo "2. Replace customer_actor.gleam with real OTP actors"
+echo "2. Replace customer_actor.gleam with real OTP actors ✅ COMPLETED"
 echo "3. Add REST API handlers with Wisp framework"
 echo "4. Replace in-memory database with SQLite"
-echo "5. Add supervisor tree for fault tolerance"
+echo "5. Add supervisor tree for fault tolerance ✅ COMPLETED"
+echo "6. Add distributed supervisor with Horde-like functionality ✅ COMPLETED"
+echo ""
+echo "🌐 NEW: Distributed Mode Available!"
+echo "Run: DISTRIBUTED_MODE=true gleam run"
 echo ""
 
 echo "📡 API Endpoints (Designed):"
@@ -97,9 +101,20 @@ echo "• Actor-like service pattern (simplified)"
 echo "• Database abstraction layer"
 echo "• Comprehensive testing strategy"
 echo "• Production-ready architecture design"
+echo "• Distributed supervisor with Horde-like functionality ✅"
+echo "• OTP actor system with proper supervision ✅"
+echo "• Consistent hashing for load distribution ✅"
+echo "• Cluster membership and fault tolerance ✅"
 echo ""
 
 echo "🎉 This provides a solid foundation for building distributed,"
 echo "fault-tolerant customer management systems with Gleam!"
 echo ""
+echo "🌐 NEW FEATURES:"
+echo "• Distributed supervisor inspired by Horde"
+echo "• Consistent hashing for actor distribution"
+echo "• Automatic node discovery and monitoring"
+echo "• Fault-tolerant cluster operations"
+echo "• Real-time cluster status monitoring"
+echo ""
 echo "See README.md and ARCHITECTURE.md for complete details."