Logging Infrastructure

Overview

Postchi's logging infrastructure is designed to handle millions of requests per day across multiple products (Email Sending and Email Validation/Stamp) with minimal performance impact. The system uses a tiered storage approach optimized for cost, speed, and scalability.

Architecture

Storage Tiers

┌─────────────┐
│   Request   │
└──────┬──────┘
       │
       ▼
┌─────────────────────────────────────┐
│  Application (API/Worker/Stamp)     │
│  - Logs event immediately           │
│  - Non-blocking Promise.all()       │
└──────┬──────────────────────────────┘
       │
       ├────────────────┬─────────────────┐
       ▼                ▼                 ▼
┌─────────────┐   ┌────────────┐   ┌──────────────┐
│    Redis    │   │   Redis    │   │    Redis     │
│ (Hot Logs)  │   │  (Queue)   │   │   (Usage)    │
│  Last 1000  │   │  Archive   │   │   Counters   │
│   per org   │   │   Queue    │   │ Daily/Month  │
│ 7 days TTL  │   │    FIFO    │   │ Auto-expire  │
└─────────────┘   └─────┬──────┘   └──────────────┘
                        │
                        ▼
                 ┌─────────────┐
                 │   Worker    │
                 │ (Every min) │
                 │ Batch: 500  │
                 └──────┬──────┘
                        │
                        ▼
                 ┌─────────────┐
                 │   S3/R2     │
                 │   Archive   │
                 │   NDJSON    │
                 │  Infinite   │
                 └─────────────┘

Storage Strategy

Storage          Purpose                         Retention                        Access Speed   Cost
Redis (Hot)      Dashboard display, recent logs  Last 1000 logs per org, 7 days   <10ms          High
Redis (Queue)    Archive queue before S3 write   Until processed                  <10ms          High
Redis (Usage)    Daily/monthly counters          35 days auto-expire              <10ms          High
S3/R2 (Archive)  Long-term storage, compliance   Infinite (configurable)          100-500ms      Very Low

Phase 1 vs Phase 2

Phase 1 (Current Implementation):

  • Redis for hot logs and queues
  • S3/R2 for cold storage
  • SQL only for aggregated usage (hourly/daily rollups)

Phase 2 (Future):

  • Add ClickHouse for efficient log queries
  • Better analytics and time-range queries
  • Still use S3 for archive/compliance

Components

1. Shared Logging Module (@postchi/shared)

Located in packages/shared/src/logging/

Files:

  • types.ts - TypeScript types for all log entries
  • redis-client.ts - Singleton Redis client (db 2)
  • s3-client.ts - S3/R2 client configuration
  • usage-meter.ts - Usage counting and quota enforcement
  • log-writer.ts - Store logs in Redis and queue for S3
  • index.ts - Public API exports

Key Functions:

// Initialize clients (call once on startup)
Logging.initRedisClient({ host, port, password, db: 2 });
Logging.initS3Client({ region, endpoint, bucket, credentials });

// Store logs (non-blocking)
await Logging.storeHotLog(logEntry);
await Logging.queueLogForArchive(logEntry);
await Logging.incrementUsage(orgId, ProductType.EMAIL_VALIDATION);

// Retrieve logs
const logs = await Logging.getHotLogs(orgId, product, limit, offset);
const count = await Logging.getHotLogsCount(orgId, product);
const usage = await Logging.getUsage(orgId, product);
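
To make the flow concrete, here is a minimal sketch of how storeHotLog could be implemented with a Redis sorted set scored by timestamp. The key shape logs:{product}:{orgId} is inferred from the health-check commands later in this document; the RedisLike interface stands in for an ioredis client, and all names are illustrative rather than the actual exports of log-writer.ts.

```typescript
const HOT_LOGS_LIMIT = 1000;         // keep only the most recent 1000 logs per org
const HOT_LOGS_TTL = 7 * 24 * 3600;  // 7 days, in seconds

// Key shape inferred from the health-check commands in this document.
export function hotLogsKey(product: string, orgId: string): string {
  return `logs:${product}:${orgId}`;
}

// Minimal chainable pipeline surface, mirroring ioredis's multi() API.
interface Pipeline {
  zadd(key: string, score: number, member: string): Pipeline;
  zremrangebyrank(key: string, start: number, stop: number): Pipeline;
  expire(key: string, seconds: number): Pipeline;
  exec(): Promise<unknown>;
}
interface RedisLike {
  multi(): Pipeline;
}

// Store one entry in a sorted set scored by timestamp, trim to the last
// HOT_LOGS_LIMIT entries, and refresh the 7-day TTL, all in one round trip.
export async function storeHotLog(
  redis: RedisLike,
  entry: { id: string; timestamp: Date; organizationId: string; product: string },
): Promise<void> {
  const key = hotLogsKey(entry.product, entry.organizationId);
  await redis
    .multi()
    .zadd(key, entry.timestamp.getTime(), JSON.stringify(entry))
    .zremrangebyrank(key, 0, -(HOT_LOGS_LIMIT + 1)) // drop oldest beyond limit
    .expire(key, HOT_LOGS_TTL)
    .exec();
}
```

Trimming by rank keeps the newest 1000 members because the sorted set is scored by timestamp, and refreshing the TTL on every write means the key only expires after seven idle days.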

2. Product Types

enum ProductType {
  EMAIL_SENDING = "EMAIL_SENDING",
  EMAIL_VALIDATION = "EMAIL_VALIDATION",
}

3. Log Entry Types

Email Validation Log

interface ValidationLogEntry {
  id: string; // UUID
  timestamp: Date;
  organizationId: string;
  apiKeyId: string;
  product: ProductType.EMAIL_VALIDATION;

  // Request data
  email: string;
  options?: {
    checkFormat?: boolean;
    checkMx?: boolean;
    checkSmtp?: boolean;
    checkDisposable?: boolean;
    checkCatchAll?: boolean;
    timeout?: number;
  };

  // Result data
  valid: boolean;
  reason: string;
  details: {
    formatValid?: boolean;
    mxExists?: boolean;
    smtpValid?: boolean;
    disposable?: boolean;
    catchAll?: boolean;
    smtpCode?: number;
    smtpMessage?: string;
  };

  // Performance
  duration: number; // milliseconds

  // Storage
  s3Key?: string; // Set after archiving
}

Email Sending Log

interface EmailSendingLogEntry {
  id: string;
  timestamp: Date;
  organizationId: string;
  apiKeyId: string;
  product: ProductType.EMAIL_SENDING;

  // Message data
  messageId: string; // Postchi message ID
  from: string;
  to: string[];
  cc?: string[];
  bcc?: string[];
  subject: string;
  templateId?: string;
  tags?: string[];
  metadata?: Record<string, any>;

  // Status
  status: 'queued' | 'processing' | 'sent' | 'failed' | 'bounced';

  // Performance
  duration: number;

  // Storage
  s3Key?: string;
}

Implementation Locations

Stamp (Email Validator)

Location: postchi-email-validator/ (separate repository)

Files Modified:

  • src/config.ts - Added Redis/R2 configuration
  • src/logging.ts - Initialize logging clients
  • src/server.ts - Call initializeLogging() on startup
  • src/routes/validate.ts - Log validation requests

Logging Flow:

// In validation endpoint (src/routes/validate.ts:89-112)
const logEntry = {
id: randomUUID(),
timestamp: new Date(),
organizationId: 'demo-org', // TODO: Get from auth
apiKeyId: 'demo-key', // TODO: Get from auth
product: ProductType.EMAIL_VALIDATION,
email,
options,
valid: result.valid,
reason: result.reason,
details: result.details,
duration,
};

// Non-blocking logging
Promise.all([
storeHotLog(logEntry),
queueLogForArchive(logEntry),
incrementUsage(organizationId, ProductType.EMAIL_VALIDATION),
]).catch((error) => {
request.log.error({ error, logEntry }, 'Failed to store validation log');
// Don't fail the request if logging fails
});

Postchi API

Location: packages/api/src/

Files Modified:

  • src/config/env.ts - Added REDIS_DB_LOGGING
  • src/index.ts - Initialize logging on startup
  • src/services/logs.service.ts - Business logic for fetching logs
  • src/api/controllers/logs.controller.ts - HTTP handlers
  • src/api/routes/logs.routes.ts - API routes
  • src/api/routes/index.ts - Registered /logs routes

Postchi Worker

Location: packages/worker/src/

Files Modified:

  • src/config/env.ts - Added Redis/R2 configuration
  • src/index.ts - Initialize logging on startup
  • src/workers/email.worker.ts - Log email sends (success + failure)
  • src/workers/log-archiver.worker.ts - NEW - Archives logs to S3

Email Worker Logging:

// After successful SMTP send (email.worker.ts:358-390)
const logEntry: Logging.EmailSendingLogEntry = {
  id: randomUUID(),
  timestamp: new Date(),
  organizationId: data.organizationId,
  apiKeyId: 'worker-send',
  product: Logging.ProductType.EMAIL_SENDING,
  messageId: data.messageId,
  from: data.from.email,
  to: data.to,
  subject,
  status: 'sent',
  duration: sendDuration,
  // ... metadata
};

// Non-blocking
Promise.all([
  Logging.storeHotLog(logEntry),
  Logging.queueLogForArchive(logEntry),
  Logging.incrementUsage(data.organizationId, Logging.ProductType.EMAIL_SENDING),
]).catch((error) => {
  logger.error({ error, messageId }, 'Failed to store email sending log');
});

Log Archiver Worker:

  • Runs every minute via BullMQ scheduled job
  • Fetches 500 logs at a time from Redis queue
  • Groups by organization and date
  • Writes to S3/R2 as newline-delimited JSON
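
The loop above can be sketched roughly as follows. The queue key archive:queue:{product} and the batch size of 500 come from this document; QueueLike and ObjectStore are stand-ins for the ioredis and S3/R2 clients, and the object-key format is illustrative rather than the worker's exact naming.

```typescript
interface LogEntry {
  id: string;
  timestamp: string | number | Date;
  organizationId: string;
  [key: string]: unknown;
}

// Group parsed entries by organization and UTC date, mirroring the
// logs/{product}/{orgId}/{yyyy}/{mm}/{dd}/ layout described below.
export function groupByOrgAndDate(entries: LogEntry[]): Map<string, LogEntry[]> {
  const groups = new Map<string, LogEntry[]>();
  for (const entry of entries) {
    const d = new Date(entry.timestamp as string | number | Date);
    const yyyy = d.getUTCFullYear();
    const mm = String(d.getUTCMonth() + 1).padStart(2, "0");
    const dd = String(d.getUTCDate()).padStart(2, "0");
    const key = `${entry.organizationId}/${yyyy}/${mm}/${dd}`;
    let bucket = groups.get(key);
    if (!bucket) {
      bucket = [];
      groups.set(key, bucket);
    }
    bucket.push(entry);
  }
  return groups;
}

interface QueueLike {
  lpop(key: string, count: number): Promise<string[] | null>;
}
interface ObjectStore {
  put(key: string, ndjson: string): Promise<void>;
}

const BATCH_SIZE = 500;

// One archiver run: drain up to 500 raw JSON strings from the queue,
// group them, and write one NDJSON object per org/date group.
export async function drainArchiveQueue(
  queue: QueueLike,
  store: ObjectStore,
  product: string,
): Promise<number> {
  const raw = await queue.lpop(`archive:queue:${product}`, BATCH_SIZE);
  if (!raw || raw.length === 0) return 0;

  const entries = raw.map((line) => JSON.parse(line) as LogEntry);
  for (const [prefix, group] of groupByOrgAndDate(entries)) {
    const ndjson = group.map((e) => JSON.stringify(e)).join("\n");
    const objectKey = `logs/${product}/${prefix}/${Date.now()}-${group[0].id.slice(0, 8)}.json`;
    await store.put(objectKey, ndjson);
  }
  return entries.length;
}
```

Grouping before writing keeps each S3 object scoped to one organization and one day, which is what makes the per-org deletion and date-range listing described below cheap.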

Scheduler:

  • src/services/log-archiver-scheduler.service.ts - Creates repeatable job
  • Registered in src/index.ts on API startup

S3/R2 Storage Structure

logs/
  EMAIL_VALIDATION/
    {organizationId}/
      2026/
        02/
          21/
            1708531200000-a1b2c3d4.json
            1708531260000-e5f6g7h8.json
  EMAIL_SENDING/
    {organizationId}/
      2026/
        02/
          21/
            1708531200000-i9j0k1l2.json

File Format: Newline-delimited JSON (NDJSON)

{"id":"log-1","timestamp":"2026-02-21T00:00:00.000Z","email":"user@example.com",...}
{"id":"log-2","timestamp":"2026-02-21T00:00:05.000Z","email":"test@example.com",...}
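
NDJSON is trivial to produce and consume line by line; a minimal sketch (helper names are illustrative, not the shared module's actual API):

```typescript
// Serialize entries to NDJSON: one JSON object per line, no trailing newline.
export function toNdjson(entries: object[]): string {
  return entries.map((e) => JSON.stringify(e)).join("\n");
}

// Parse NDJSON back into objects, skipping blank lines.
export function fromNdjson<T = unknown>(ndjson: string): T[] {
  return ndjson
    .split("\n")
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as T);
}
```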

Benefits:

  • Organization isolation (easy to delete org data for GDPR)
  • Time-based partitioning (efficient date range queries)
  • Small files (better performance than giant files)
  • NDJSON format (easy to stream and process line-by-line)
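
For example, the GDPR case reduces to deleting everything under one prefix. A hedged sketch, where ObjectLister abstracts the S3/R2 ListObjectsV2 and DeleteObjects calls and the helper names are hypothetical:

```typescript
// Prefix under which all of one org's logs for one product live
// (mirrors the layout above).
export function orgLogPrefix(product: string, orgId: string): string {
  return `logs/${product}/${orgId}/`;
}

// Minimal surface over the object store; a real implementation would wrap
// the AWS SDK's ListObjectsV2 (with pagination) and DeleteObjects commands.
interface ObjectLister {
  listKeys(prefix: string): Promise<string[]>;
  deleteKeys(keys: string[]): Promise<void>;
}

// Delete every archived log object belonging to one organization.
export async function deleteOrgLogs(
  store: ObjectLister,
  product: string,
  orgId: string,
): Promise<number> {
  const keys = await store.listKeys(orgLogPrefix(product, orgId));
  // S3's DeleteObjects accepts at most 1000 keys per request.
  for (let i = 0; i < keys.length; i += 1000) {
    await store.deleteKeys(keys.slice(i, i + 1000));
  }
  return keys.length;
}
```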

API Endpoints

Get Logs

Generic Endpoint:

GET /v1/logs?product=EMAIL_VALIDATION&limit=100&offset=0
Authorization: Bearer <token>

Convenience Endpoints:

GET /v1/logs/validation?limit=100&offset=0
GET /v1/logs/sending?limit=100&offset=0

Response:

{
  "success": true,
  "data": [
    {
      "id": "550e8400-e29b-41d4-a716-446655440000",
      "timestamp": 1708531200000,
      "email": "user@example.com",
      "valid": true,
      "reason": "valid",
      "duration": 150
    }
  ],
  "pagination": {
    "total": 1000,
    "limit": 100,
    "offset": 0,
    "hasMore": true
  },
  "meta": {
    "product": "EMAIL_VALIDATION"
  }
}

Get Usage Statistics

Single Product:

GET /v1/logs/usage?product=EMAIL_VALIDATION

Response:

{
  "success": true,
  "data": {
    "dailyUsage": 1234,
    "monthlyUsage": 45678,
    "product": "EMAIL_VALIDATION"
  }
}

All Products:

GET /v1/logs/usage/all

Response:

{
  "success": true,
  "data": {
    "validation": {
      "dailyUsage": 1234,
      "monthlyUsage": 45678,
      "product": "EMAIL_VALIDATION"
    },
    "sending": {
      "dailyUsage": 5678,
      "monthlyUsage": 123456,
      "product": "EMAIL_SENDING"
    }
  }
}

Environment Variables

Required for All Services

# Redis (Logging)
REDIS_HOST=localhost
REDIS_PORT=6379
REDIS_PASSWORD=optional
REDIS_DB_LOGGING=2 # Use db 2 for logging

# Cloudflare R2 (Archive Storage)
R2_ACCOUNT_ID=your-account-id
R2_ACCESS_KEY_ID=your-access-key
R2_SECRET_ACCESS_KEY=your-secret
R2_BUCKET_NAME=postchi-logs
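
A sketch of how these variables might be consumed, with the defaults shown above. The shape is illustrative; the actual env.ts files in the api and worker packages may validate differently.

```typescript
// Hypothetical config shape for the logging subsystem.
export interface LoggingEnv {
  redisHost: string;
  redisPort: number;
  redisPassword?: string;
  redisDbLogging: number;
  r2Bucket: string;
}

// Read logging config from an env map, falling back to the documented defaults.
export function loadLoggingEnv(env: Record<string, string | undefined>): LoggingEnv {
  return {
    redisHost: env.REDIS_HOST ?? "localhost",
    redisPort: Number(env.REDIS_PORT ?? 6379),
    redisPassword: env.REDIS_PASSWORD || undefined, // empty string means "no auth"
    redisDbLogging: Number(env.REDIS_DB_LOGGING ?? 2),
    r2Bucket: env.R2_BUCKET_NAME ?? "postchi-logs",
  };
}
```

Passing the env map in (rather than reading process.env inside) keeps the loader easy to unit test.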

Performance Characteristics

Write Performance

  • Redis hot log write: <1ms
  • Redis queue push: <1ms
  • Redis usage counter increment: <1ms
  • Total logging overhead: <5ms (all async, non-blocking)

Read Performance

  • Dashboard (last 100 logs): <10ms (Redis)
  • Archive query (S3): 100-500ms (depends on file size)

Cost Optimization

  • Redis only stores last 1000 logs per org (compact format)
  • Auto-expire keys after 7 days
  • S3 storage: ~$0.015/GB/month (Cloudflare R2)
  • Estimated: $50-100/month for 10M logs/month

Scaling Considerations

Current Capacity

  • Redis: Can handle 100K+ writes/sec
  • S3: No practical limit
  • Worker: Processes up to 30K logs/hour (batches of 500, one run per minute)

When to Scale

  • If queue backup > 5 minutes: Increase worker concurrency
  • If Redis memory > 80%: Reduce hot log limit or TTL
  • If S3 costs high: Implement lifecycle policies (move to Glacier)
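
The first rule can be checked numerically: at one run per minute with batches of 500, queue depth divided by 500 gives the backlog in minutes. A sketch (helper names illustrative):

```typescript
// Estimated minutes needed to drain the archive queue at the current rate.
export function queueBacklogMinutes(queueDepth: number, drainRatePerMinute = 500): number {
  if (drainRatePerMinute <= 0) throw new Error("drain rate must be positive");
  return queueDepth / drainRatePerMinute;
}

// Scale-up signal per the rule above: more than 5 minutes of backlog.
export function shouldScaleArchiver(queueDepth: number): boolean {
  return queueBacklogMinutes(queueDepth) > 5;
}
```

The queue depth itself comes from LLEN on the archive queue key, as shown in the Monitoring section.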

Phase 2 Migration (ClickHouse)

When query performance becomes important:

  1. Keep current Redis + S3 setup
  2. Add ClickHouse for analytics
  3. Worker writes to both S3 and ClickHouse
  4. Use ClickHouse for dashboard queries
  5. Keep S3 for compliance/backup

Monitoring

Key Metrics to Track

  • Redis queue depth: LLEN archive:queue:EMAIL_VALIDATION
  • Redis memory usage: INFO memory
  • S3 write failures: Check worker logs
  • Logging errors: Count of caught errors in Promise.catch()

Health Checks

# Check Redis connection
redis-cli -h localhost -p 6379 -n 2 PING

# Check queue size
redis-cli -h localhost -p 6379 -n 2 LLEN archive:queue:EMAIL_VALIDATION

# Check hot logs count for org
redis-cli -h localhost -p 6379 -n 2 ZCARD logs:EMAIL_VALIDATION:{orgId}

Testing

Manual Testing

1. Send validation request (Stamp):

curl -X POST http://localhost:3000/api/v1/validate \
-H "Content-Type: application/json" \
-d '{"email": "test@example.com"}'

2. Check Redis (hot logs):

redis-cli -n 2 ZRANGE logs:EMAIL_VALIDATION:demo-org 0 -1

3. Check Redis (queue):

redis-cli -n 2 LLEN archive:queue:EMAIL_VALIDATION

4. Check API (get logs):

curl http://localhost:3000/v1/logs/validation \
-H "Authorization: Bearer <token>"

5. Wait 1 minute for archiver, then check S3:

aws s3 ls s3://postchi-logs/logs/EMAIL_VALIDATION/demo-org/2026/02/21/ \
--endpoint-url=https://<account>.r2.cloudflarestorage.com

Troubleshooting

Logs not appearing in Redis

  • Check Redis connection: redis-cli -n 2 PING
  • Verify logging initialized: Look for "✅ Logging infrastructure initialized" in logs
  • Check for errors in application logs

Logs not archiving to S3

  • Check worker is running: ps aux | grep worker
  • Check queue has items: redis-cli -n 2 LLEN archive:queue:EMAIL_VALIDATION
  • Check worker logs for S3 errors
  • Verify R2 credentials are correct

High Redis memory usage

  • Check total keys: redis-cli -n 2 DBSIZE
  • Check largest keys: redis-cli -n 2 --bigkeys
  • Reduce HOT_LOGS_LIMIT in log-writer.ts (currently 1000)
  • Reduce TTL (currently 7 days)

Usage counters incorrect

  • Counters auto-expire (daily: 2 days, monthly: 35 days)
  • Check if date changed during testing
  • Manually check Redis: redis-cli -n 2 GET usage:EMAIL_VALIDATION:{orgId}:daily:{YYYYMMDD}
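
The counter keys follow the pattern in the command above. A sketch of composing them (the daily shape is taken from that command, the monthly shape is extrapolated; the real usage-meter.ts may differ):

```typescript
// Format a Date as YYYYMMDD in UTC.
function yyyymmdd(d: Date): string {
  const mm = String(d.getUTCMonth() + 1).padStart(2, "0");
  const dd = String(d.getUTCDate()).padStart(2, "0");
  return `${d.getUTCFullYear()}${mm}${dd}`;
}

// usage:{product}:{orgId}:daily:{YYYYMMDD}
export function dailyUsageKey(product: string, orgId: string, d: Date): string {
  return `usage:${product}:${orgId}:daily:${yyyymmdd(d)}`;
}

// usage:{product}:{orgId}:monthly:{YYYYMM}
export function monthlyUsageKey(product: string, orgId: string, d: Date): string {
  return `usage:${product}:${orgId}:monthly:${yyyymmdd(d).slice(0, 6)}`;
}
```

Using UTC consistently matters here: a counter keyed by local date would "change days" at a different moment than the archive layout, which is one source of the incorrect-looking numbers described above.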

Future Enhancements

Phase 2 (ClickHouse)

  • Set up ClickHouse cluster
  • Create tables with proper partitioning
  • Modify worker to write to both S3 and ClickHouse
  • Update API to query ClickHouse for analytics
  • Keep S3 as backup/archive

Features

  • Log retention policies (auto-delete after X days)
  • Advanced filtering (by date range, status, email domain)
  • Export logs to CSV/JSON
  • Real-time log streaming (WebSockets)
  • Alerting on unusual patterns
  • Aggregated analytics dashboard
  • Cost attribution per organization

Optimizations

  • Compress logs before S3 write (gzip)
  • Use Parquet format instead of NDJSON for better compression
  • Implement log sampling for high-volume orgs
  • Add rate limiting per organization
  • Implement quota enforcement in logging layer
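
The first optimization is straightforward with Node's built-in zlib; a sketch (the archiver would also need to name objects .json.gz and set the Content-Encoding metadata, which is not shown):

```typescript
import { gzipSync, gunzipSync } from "zlib";

// Compress an NDJSON batch before upload. Repetitive text logs typically
// compress several-fold with gzip; the exact ratio depends on the data.
export function compressBatch(ndjson: string): Buffer {
  return gzipSync(Buffer.from(ndjson, "utf8"));
}

// Decompress a batch read back from the archive.
export function decompressBatch(gz: Buffer): string {
  return gunzipSync(gz).toString("utf8");
}
```

Synchronous zlib calls are fine here because the archiver already processes batches off the request path; a hot path would use the async variants instead.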

Summary

The logging infrastructure is designed to:

  1. Scale to millions of requests/day without performance impact
  2. Keep costs low using tiered storage
  3. Provide fast dashboard access with Redis hot logs
  4. Enable compliance with S3 long-term storage
  5. Allow easy analytics in Phase 2 with ClickHouse

All logging is non-blocking and fault-tolerant: if logging fails, the core application continues to work.