RabbitMQ vs Apache Kafka Comparison

Use Case: Integrating HC Portal, PMS, and Genba Apps


1. System Requirements Overview

This comparison is an objective analysis of RabbitMQ and Apache Kafka against the following requirements:

  • Integration of 3 systems (HC Portal, PMS, Genba Apps)
  • Synchronization of project progress and updates
  • Audit trail for activity in PMS and Genba
  • Notification service for multiple channels (email, SMS, push notification)

Analysis Focus: Cost saving and operational efficiency


2. Cost Analysis Comparison

2.1 Infrastructure Cost

| Component | RabbitMQ | Kafka |
| --- | --- | --- |
| Minimum Production Setup | 2-node cluster | 3 brokers + KRaft |
| Server Requirements | 2x (2 CPU, 8GB RAM, 100GB SSD) | 3x (3 CPU, 8GB RAM, 200GB SSD) |
| Monthly Cloud Cost (AWS/GCP) | $150-200/month | $250-300/month |
| Storage Growth | Linear (messages deleted after consumption) | Retention-based (grows with throughput until the 7-day retention window is full) |
| Network Transfer | Lower (push model) | Higher (pull + replication) |
| Monitoring Tools | Management UI for the system | Open-source Kafka UI for debugging the system and messages |
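
With time-based retention, the disk a Kafka topic holds is roughly the product of message volume, message size, retention window, and replication factor. The sketch below shows that arithmetic only; the function name and every input value are illustrative assumptions, not measurements from these systems.

```javascript
// Rough estimate of the disk retained by one Kafka topic.
// All inputs are illustrative assumptions, not measured values.
function estimateRetainedBytes({ messagesPerDay, avgMessageBytes, retentionDays, replicationFactor }) {
  return messagesPerDay * avgMessageBytes * retentionDays * replicationFactor;
}

const retainedBytes = estimateRetainedBytes({
  messagesPerDay: 100_000, // assumed daily volume
  avgMessageBytes: 1_024,  // assumed 1 KiB per message
  retentionDays: 7,        // retention window from the table above
  replicationFactor: 3,    // assumed: one copy per broker
});

console.log((retainedBytes / 1024 ** 3).toFixed(2) + ' GiB');
```

Once the retention window is full, old segments are deleted as new ones arrive, so the estimate is a ceiling for a steady message rate rather than a figure that keeps growing.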

2.2 Operational Cost

| Aspect | RabbitMQ | Kafka |
| --- | --- | --- |
| Learning Curve | Easy | Fairly difficult |
| Backup/Recovery | Simple (export/import) | Complex (partition management) |
| Upgrade Complexity | Low (rolling upgrade) | Higher (rolling upgrade is supported, but brokers are restarted one at a time and protocol versions need planning) |
| Troubleshooting | Straightforward | Requires understanding of partitions, brokers, and replication |

2.3 Development Cost

| Area | RabbitMQ | Kafka |
| --- | --- | --- |
| Initial Setup Time | 1 day | 3-7 days |
| Client Library | Simple | More complex configuration |
| Testing Environment | Lightweight Docker setup | Swarm/Compose/Kubernetes needed |
| Development Skills | Common knowledge | Specialized expertise |
| Documentation/Training | 1-2 hours | 1-2 days |

Total Cost Comparison (1st Year):

  • RabbitMQ: ~$2,400 infra + $15,000 development = $17,400
  • Kafka: ~$7,200 infra + $25,000 development = $32,200

3. Technical Capability Comparison

3.1 Core Architecture

| Aspect | RabbitMQ | Kafka |
| --- | --- | --- |
| Message Model | Queue-based (FIFO) | Log-based (append-only) |
| Delivery Guarantee | At-least-once; exactly-once only with consumer-side deduplication | At-least-once; exactly-once (idempotent producer + transactions) |
| Message Ordering | Per queue | Per partition |
| Message Retention | Until consumed (deleted) | Time/size-based (configurable) |
| Consumer Model | Push (broker distributes) | Pull (consumer fetches) |
| Routing Flexibility | Complex (Exchange, Binding, Routing Key) | Simple (Topic, Partition) |

3.2 Performance Metrics

| Metric | RabbitMQ | Kafka |
| --- | --- | --- |
| Throughput | 20K-50K msg/sec per node | 100K-1M msg/sec per broker |
| Latency | Sub-millisecond (<1ms) | 2-5ms average |
| Message Size | Best for small messages (<1MB) | Also tuned for small messages (default limit about 1MB, configurable) |
| RAM Usage | Higher (stores in memory) | Lower (disk-based) |
| CPU Usage | Moderate | Lower per message |
| Disk I/O | Lower | Higher (sequential writes) |

3.3 Feature Support for the Use Case

Audit Trail & Logging

| Feature | RabbitMQ | Kafka |
| --- | --- | --- |
| Long-term Storage | Requires external DB (ClickHouse, etc.) | Native support with retention |
| Message Replay | Not supported natively | Full replay from any offset |
| Audit Query | Must implement separately | Can read historical data |
| Storage Cost | Higher (need separate storage) | Lower (built-in retention) |
| Compliance | Manual implementation | Built-in immutable log |
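
The replay row is the main difference for an audit trail: a queue removes a message once it is acknowledged, while a log keeps every record at a fixed offset so it can be read again. The sketch below is an in-memory stand-in for a single partition, not a Kafka client; the class name and the event fields are illustrative assumptions.

```javascript
// In-memory stand-in for one partition of an append-only log (not a Kafka client).
class AppendOnlyLog {
  constructor() {
    this.records = [];
  }

  // Append a record and return its offset (its position in the log).
  append(value) {
    this.records.push(value);
    return this.records.length - 1;
  }

  // Read every record from the given offset onward; nothing is removed.
  readFrom(offset) {
    return this.records.slice(offset);
  }
}

const auditLog = new AppendOnlyLog();
auditLog.append({ system: 'PMS', action: 'project.updated' });
auditLog.append({ system: 'Genba', action: 'progress.reported' });
auditLog.append({ system: 'PMS', action: 'project.closed' });

const fullHistory = auditLog.readFrom(0);      // replay everything
const sinceSecondEvent = auditLog.readFrom(1); // replay from offset 1
console.log(fullHistory.length, sinceSecondEvent.length); // prints: 3 2
```

Reading does not consume anything, so a second reader (for example an auditor's query tool) can replay the same records independently.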

Notification Service

| Feature | RabbitMQ | Kafka |
| --- | --- | --- |
| Priority Messages | Built-in priority queues | Manual implementation |
| Delayed Messages | Plugin available | Requires custom solution |
| Dead Letter Queue | Native feature | Manual implementation |
| TTL (Time To Live) | Built-in | Manual cleanup needed |
| Retry Mechanism | Built-in with DLQ | Consumer-side implementation |

System Integration & Synchronization

| Capability | RabbitMQ | Kafka |
| --- | --- | --- |
| Request-Reply Pattern | Native support | Requires correlation |
| Event Sourcing | Manual implementation | Native pattern |
| CQRS Support | Possible but complex | Natural fit |
| Transactional Messaging | Support with limitations | Full transaction support |
| Multi-system Fan-out | Via Exchange | Via Consumer Groups |

4. Implementation Comparison

4.1 Audit Trail Implementation

RabbitMQ Approach:

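// Requires an additional database for long-term persistence.
// `channel` is an open amqplib channel; `database` is an application-specific storage client.
async function auditWithRabbitMQ(action) {
  // Publish the audit event as a persistent message
  channel.sendToQueue('audit-queue', Buffer.from(JSON.stringify(action)), { persistent: true });
}

// Register the consumer once at startup, not on every publish
async function startAuditConsumer() {
  await channel.consume('audit-queue', async (msg) => {
    if (msg === null) return; // consumer was cancelled by the broker
    const audit = JSON.parse(msg.content.toString());
    await database.save(audit); // Additional storage needed
    channel.ack(msg);
  });
}
// Cost: Message broker + Database storage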

Kafka Approach:

// Direct storage in Kafka with retention
async function auditWithKafka(action) {
  await producer.send({
    topic: 'audit-trail',
    messages: [{ value: JSON.stringify(action) }]
  });
  
  // Data retained in Kafka (e.g., 365 days)
  // No additional database required for audit storage
}
// Cost: Only Kafka storage

4.2 Notification Service Implementation

RabbitMQ Approach:

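// Built-in routing and priority
// Declare the queue first; priorities only take effect when x-max-priority is set
await channel.assertQueue('email-queue', {
  durable: true,
  arguments: {
    'x-dead-letter-exchange': 'dlx', // rejected or expired messages are routed here
    'x-max-priority': 10
  }
});

// `payload` must be a Buffer
channel.publish('notifications', 'email.high', payload, {
  priority: 10,
  expiration: '3600000' // TTL 1 hour, in milliseconds
});

// Retry with Dead Letter Queue: RabbitMQ has no 'x-max-retries' argument.
// Limit retries with 'x-delivery-limit' on quorum queues, or by inspecting
// the 'x-death' header in the consumer.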
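
RabbitMQ records each dead-lettering of a message in its `x-death` header, so one way to cap retries on the consumer side is to read that header before deciding whether to requeue. The sketch below assumes the header shape that amqplib exposes under `msg.properties.headers`; the function names and the limit of 3 are illustrative assumptions, not part of any library.

```javascript
// Count how many times a message has been dead-lettered from a given queue.
// `headers` is expected to look like amqplib's msg.properties.headers.
function deathCount(headers, queueName) {
  const deaths = (headers && headers['x-death']) || [];
  return deaths
    .filter((entry) => entry.queue === queueName)
    .reduce((total, entry) => total + Number(entry.count || 0), 0);
}

// Hypothetical policy helper: retry until the message has died maxRetries times.
function shouldRetry(headers, queueName, maxRetries = 3) {
  return deathCount(headers, queueName) < maxRetries;
}

// Example header as it might arrive after two rejections from 'email-queue'.
const exampleHeaders = {
  'x-death': [{ queue: 'email-queue', reason: 'rejected', count: 2 }],
};
console.log(shouldRetry(exampleHeaders, 'email-queue')); // prints: true
```

When `shouldRetry` returns false, the consumer would typically acknowledge the message and publish it to a parking queue for manual inspection instead of rejecting it again.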

Kafka Approach:

// Manual routing implementation
await producer.send({
  topic: 'notifications',
  messages: [{
    key: 'email',
    value: JSON.stringify(payload),
    headers: { priority: '10', channel: 'email' }
  }]
});

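// Consumer must implement retry logic
const consumer = kafka.consumer({ groupId: 'notification-group' });
await consumer.connect();
await consumer.subscribe({ topics: ['notifications'] });
await consumer.run({
  eachMessage: async ({ message }) => {
    try {
      await processNotification(message);
    } catch (error) {
      // Manual retry: republish to a separate retry topic
      // ('notifications-retry' is an illustrative name) for a later attempt
      await producer.send({
        topic: 'notifications-retry',
        messages: [{ key: message.key, value: message.value, headers: message.headers }]
      });
    }
  }
});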
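
Because Kafka has no built-in dead letter queue, the consumer has to decide where a failed message goes next. A common approach, sketched here under assumptions, is to track an attempt counter in a header and route to a retry topic until a limit is reached, then to a dead-letter topic. The topic names, the `attempt` header, and the function name are all hypothetical.

```javascript
// Decide where a failed message should be republished.
// Returns an object shaped like the argument of a producer `send` call.
function planRetry(message, maxAttempts = 3) {
  const headers = message.headers || {};
  // Header values may arrive as Buffers, so normalise through String().
  const attempt = Number(String(headers.attempt ?? '0')) + 1;
  const topic = attempt >= maxAttempts ? 'notifications-dlq' : 'notifications-retry';
  return {
    topic,
    messages: [{
      key: message.key,
      value: message.value,
      headers: { ...headers, attempt: String(attempt) },
    }],
  };
}

const firstFailure = planRetry({ key: 'email', value: '{"to":"user"}', headers: {} });
console.log(firstFailure.topic); // prints: notifications-retry
```

The returned object could be passed straight to `producer.send(...)` inside the `catch` block of the consumer.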

4.3 Project Synchronization

RabbitMQ Approach:

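// Request-Reply pattern
const { randomUUID } = require('crypto');

const correlationId = randomUUID();
// Exclusive, server-named queue that receives the reply
const { queue: replyQueue } = await channel.assertQueue('', { exclusive: true });
await channel.consume(replyQueue, (msg) => {
  if (msg && msg.properties.correlationId === correlationId) {
    // Handle the direct response here
  }
}, { noAck: true });
channel.sendToQueue('rpc-queue', Buffer.from(data), { correlationId, replyTo: replyQueue });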
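
When several requests are in flight at once, the caller needs a way to match each reply to the request that produced it. One possible sketch is a map of pending requests keyed by correlation ID; the class and method names below are hypothetical and independent of any client library.

```javascript
const { randomUUID } = require('crypto');

// Tracks in-flight requests so replies can be matched by correlation ID.
class PendingReplies {
  constructor() {
    this.pending = new Map();
  }

  // Create a correlation ID and a promise that settles when the reply arrives.
  register() {
    const correlationId = randomUUID();
    const promise = new Promise((resolve) => this.pending.set(correlationId, resolve));
    return { correlationId, promise };
  }

  // Called from the reply-queue consumer; returns false for unknown IDs.
  resolve(correlationId, body) {
    const settle = this.pending.get(correlationId);
    if (!settle) return false;
    this.pending.delete(correlationId);
    settle(body);
    return true;
  }
}

const replies = new PendingReplies();
const request = replies.register();
request.promise.then((body) => console.log('reply:', body));
replies.resolve(request.correlationId, 'ok');
```

A production version would also reject the promise after a timeout so that a lost reply does not leave the caller waiting indefinitely.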

Kafka Approach:

// Event-driven pattern
await producer.send({
  topic: 'project-events',
  messages: [{ 
    key: projectId,
    value: JSON.stringify({ event: 'updated', data })
  }]
});
// All systems consume independently
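
Using `projectId` as the message key matters because Kafka only guarantees ordering within a partition, and messages with the same key are assigned to the same partition. The sketch below illustrates that idea with a simple string hash; Kafka clients use their own partitioner (murmur2 by default in the Java client), so the hash here is a stand-in and the function name is hypothetical.

```javascript
// Map a message key to a partition so that equal keys always land together.
// The hash is a simple illustrative stand-in, not Kafka's real partitioner.
function partitionFor(key, numPartitions) {
  let hash = 0;
  for (const ch of String(key)) {
    hash = (hash * 31 + ch.codePointAt(0)) >>> 0; // keep within unsigned 32 bits
  }
  return hash % numPartitions;
}

const firstUpdate = partitionFor('project-42', 6);
const secondUpdate = partitionFor('project-42', 6);
console.log(firstUpdate === secondUpdate); // prints: true
```

Events for different projects may share a partition, but all events for one project stay in order relative to each other.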

5. Scalability & Maintenance Comparison

5.1 Scaling Characteristics

| Aspect | RabbitMQ | Kafka |
| --- | --- | --- |
| Vertical Scaling | Effective up to a certain limit | Limited benefit |
| Horizontal Scaling | Add nodes to cluster | Add brokers + partitions |
| Auto-scaling | Complex (manual rebalancing) | Better (automatic consumer-group rebalancing; moving partitions to new brokers is a separate step) |
| Scaling Cost | Linear (add nodes) | Higher initial, better at scale |
| Performance at Scale | Degrades with queue depth | Consistent performance |

5.2 Maintenance Requirements

| Task | RabbitMQ | Kafka |
| --- | --- | --- |
| Backup | Export definitions + messages | Partition replicas + MirrorMaker |
| Recovery Time | Minutes to hours | Hours to days (large data) |
| Monitoring Complexity | Simple metrics | Complex (lag, ISR, etc.) |
| Troubleshooting | Clear error messages | Requires deep knowledge |
| Version Upgrade | Usually smooth | Careful planning needed |
| Data Cleanup | Automatic (after consumption) | Policy-driven (retention policy must be configured) |

5.3 Resource Utilization Over Time

Small Scale (< 10K msg/day):

  • RabbitMQ: 2GB RAM, 2 CPU cores, 50GB storage
  • Kafka: 8GB RAM, 4 CPU cores, 200GB storage

Medium Scale (10K-100K msg/day):

  • RabbitMQ: 8GB RAM, 4 CPU cores, 200GB storage
  • Kafka: 16GB RAM, 8 CPU cores, 1TB storage

Large Scale (> 1M msg/day):

  • RabbitMQ: 32GB RAM, 16 CPU cores, 1TB storage
  • Kafka: 32GB RAM, 16 CPU cores, 5TB+ storage
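
The tiers above are stated per day, while the throughput figures in section 3.2 are per second, so it helps to convert before comparing them. The sketch below shows the conversion for an evenly spread load; the function name is hypothetical, and real traffic will have peaks well above the average.

```javascript
// Convert a daily message volume into an average per-second rate.
const SECONDS_PER_DAY = 24 * 60 * 60; // 86,400

function averageMessagesPerSecond(messagesPerDay) {
  return messagesPerDay / SECONDS_PER_DAY;
}

for (const perDay of [10_000, 100_000, 1_000_000]) {
  console.log(perDay + ' msg/day is about ' + averageMessagesPerSecond(perDay).toFixed(2) + ' msg/sec');
}
```

Even the large tier averages roughly 12 messages per second, far below the per-node throughput listed for either broker, so sizing here is driven more by retention, availability, and peak bursts than by raw throughput.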

7. Monitoring & Observability

7.1 Metrics to Monitor

Kafka Metrics:

  • Consumer lag per partition
  • Message throughput
  • Disk usage & retention
  • Replication status
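
Consumer lag per partition is the gap between the latest offset written to a partition and the offset the consumer group has committed. A minimal sketch of that calculation follows; the function name and the plain-object input shape are assumptions chosen for illustration, and real offsets would come from the broker or a monitoring exporter.

```javascript
// Compute lag per partition from end offsets and committed offsets.
// Both arguments map a partition number to an offset.
function consumerLag(endOffsets, committedOffsets) {
  const lag = {};
  for (const [partition, endOffset] of Object.entries(endOffsets)) {
    const committed = committedOffsets[partition] ?? 0; // nothing committed yet
    lag[partition] = Math.max(0, endOffset - committed);
  }
  return lag;
}

const lagByPartition = consumerLag({ 0: 1200, 1: 950, 2: 400 }, { 0: 1200, 1: 900 });
console.log(lagByPartition); // prints: { '0': 0, '1': 50, '2': 400 }
```

A lag that keeps rising on one partition usually points to a slow or stuck consumer for that partition rather than a broker problem.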

RabbitMQ Metrics:

  • Queue depth
  • Consumer utilization
  • Message rates (publish/deliver/ack)
  • Connection count

7.2 Recommended Tools

Kafka Monitoring:
- Prometheus + Grafana
- Kafka UI

RabbitMQ Monitoring:
- RabbitMQ Management Plugin
- Prometheus RabbitMQ Exporter
- Grafana Dashboard

Distributed Tracing:
- Jaeger or Zipkin
- Correlation ID across systems
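
Carrying a correlation ID across HC Portal, PMS, and Genba Apps means every outgoing message reuses the ID it received, and only the first system in the chain generates one. The sketch below shows that rule for a plain headers object; the header name `x-correlation-id` and the function name are assumptions, not an existing convention in these systems.

```javascript
const { randomUUID } = require('crypto');

// Hypothetical header name; pick one and use it consistently in all systems.
const CORRELATION_HEADER = 'x-correlation-id';

// Reuse the incoming correlation ID if present, otherwise start a new one.
function withCorrelationId(incomingHeaders = {}) {
  const correlationId = incomingHeaders[CORRELATION_HEADER] || randomUUID();
  return { ...incomingHeaders, [CORRELATION_HEADER]: correlationId };
}

const originHeaders = withCorrelationId();                 // first system: new ID
const downstreamHeaders = withCorrelationId(originHeaders); // next system: same ID
console.log(originHeaders[CORRELATION_HEADER] === downstreamHeaders[CORRELATION_HEADER]); // prints: true
```

The same value can be attached to log lines and trace spans so that one search in Jaeger or Zipkin shows the whole path of a request.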

8. Resource & Cost Estimate

8.1 Infrastructure Requirements

Kafka Cluster (Production):
- 3 Brokers (4 CPU, 16GB RAM, 500GB SSD each)
- Estimated: $250-300/month (cloud)

RabbitMQ Cluster:
- 2 Nodes (2 CPU, 8GB RAM, 100GB SSD each)
- HAProxy Load Balancer
- Estimated: $150-200/month (cloud)

Total Infrastructure: ~$600-800/month

Document Version: 1.0
Last Updated: October 2024
Author: System Architecture Team