Plutus Payment System - Logging Best Practices

Overview

This document outlines the enhanced logging system implemented in the Plutus Payment Processing System. The logging infrastructure provides comprehensive monitoring, security event tracking, performance analysis, and automated log management.

Logging Architecture

Core Components

  1. Enhanced Logging Configuration (logging_config.py)

    • Structured logging with correlation IDs
    • Multiple specialized logger types
    • Automatic log formatting and rotation
  2. Middleware System (middleware.py)

    • Request/response logging
    • Performance monitoring
    • Security event detection
    • Database query tracking
  3. Analytics Dashboard (blueprints/analytics.py)

    • Real-time system health monitoring
    • Performance metrics visualization
    • Security event analysis
    • Log search and filtering
  4. Log Retention System (log_retention.py)

    • Automated cleanup and archiving
    • Configurable retention policies
    • Disk space management

Logger Types

StructuredLogger

General-purpose logger with correlation ID support and structured data.

from logging_config import get_logger

logger = get_logger('module_name')
logger.info("Payment processed successfully", 
            payment_id=12345, 
            amount=89.95, 
            customer_id="cus_123")
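
How get_logger is implemented in logging_config.py is not shown in this document. As a rough sketch of the idea only, keyword arguments can be serialized as a JSON suffix on top of the standard logging module. The class name KeywordLogger and its stream parameter are hypothetical and exist just for this illustration.

```python
import io
import json
import logging


class KeywordLogger:
    """Hypothetical stand-in for StructuredLogger: appends keyword fields as JSON."""

    def __init__(self, name, stream=None):
        self._logger = logging.getLogger(name)
        self._logger.setLevel(logging.INFO)
        self._logger.propagate = False
        handler = logging.StreamHandler(stream)
        handler.setFormatter(
            logging.Formatter("%(name)s - %(levelname)s - %(message)s"))
        # Replace any handlers left over from earlier configuration.
        self._logger.handlers.clear()
        self._logger.addHandler(handler)

    def info(self, message, **fields):
        # Serialize the structured fields after the human-readable message.
        suffix = f" {json.dumps(fields)}" if fields else ""
        self._logger.info(message + suffix)


buffer = io.StringIO()
logger = KeywordLogger("plutus.payments", stream=buffer)
logger.info("Payment processed successfully", payment_id=12345, amount=89.95)
output = buffer.getvalue()
```

Keeping the fields as JSON at the end of the line means the message stays readable while the data remains machine-parseable.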

SecurityLogger

Specialized logger for security events and threats.

from logging_config import security_logger

security_logger.log_login_attempt("username", success=False, ip_address="192.168.1.1")
security_logger.log_payment_fraud_alert(payment_id=123, customer_id="cus_456", 
                                       reason="Unusual amount pattern", amount=5000.0)

PerformanceLogger

Dedicated logger for performance monitoring and optimization.

from logging_config import performance_logger

performance_logger.log_request_time("POST /payments", "POST", 1250.5, 200, user_id=1)
performance_logger.log_stripe_api_call("create_payment", 850.2, True)

Log Files Structure

File Organization

logs/
├── plutus_detailed.log          # Comprehensive application logs
├── performance.log              # Performance metrics and slow operations
├── security.log                 # Security events and threats
├── payment_processing.log       # Payment-specific operations
├── archive/                     # Archived logs by month
│   ├── 202409/
│   └── 202410/
└── *.log.gz                     # Compressed rotated logs

Log Formats

Standard Format

2024-09-02 14:30:15,123 - [corr-abc123] - plutus.payments - INFO - Payment processed successfully {"payment_id": 12345, "amount": 89.95}

Security Format

2024-09-02 14:30:15,123 - SECURITY - [corr-abc123] - WARNING - LOGIN_FAILED for user: testuser {"ip_address": "192.168.1.1", "user_agent": "Mozilla/5.0..."}

Performance Format

2024-09-02 14:30:15,123 - PERF - [corr-abc123] - REQUEST: POST /payments - 1250.50ms - 200 {"user_id": 1, "endpoint": "/payments"}
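
Because these formats are regular, a line can be parsed back into a structured record for searching or analysis. The sketch below handles only the standard format and assumes the trailing JSON object, when present, is the last thing on the line. The names STANDARD_LINE and parse_standard_line are illustrative and not part of the Plutus codebase.

```python
import json
import re

# Pattern for the standard format:
# timestamp - [correlation id] - logger name - LEVEL - message {json fields}
STANDARD_LINE = re.compile(
    r"^(?P<timestamp>\S+ \S+) - \[(?P<correlation_id>[^\]]+)\] - "
    r"(?P<logger>\S+) - (?P<level>[A-Z]+) - "
    r"(?P<message>.*?)(?: (?P<fields>\{.*\}))?$"
)


def parse_standard_line(line):
    """Return a dict of the line's parts, or None if the line does not match."""
    match = STANDARD_LINE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["fields"] = json.loads(record["fields"]) if record["fields"] else {}
    return record


sample = ('2024-09-02 14:30:15,123 - [corr-abc123] - plutus.payments - INFO - '
          'Payment processed successfully {"payment_id": 12345, "amount": 89.95}')
record = parse_standard_line(sample)
```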

Correlation IDs

Purpose

Correlation IDs track requests across the entire system, making it easy to trace a single operation through multiple components.

Usage

from logging_config import get_logger, log_context, set_correlation_id

logger = get_logger('module_name')

# Automatic correlation ID
with log_context():
    logger.info("Processing payment")  # Will include auto-generated correlation ID

# Custom correlation ID
with log_context("req-12345"):
    logger.info("Processing payment")  # Will include "req-12345"

# Manual setting
correlation_id = set_correlation_id("custom-id")
logger.info("Payment processed")
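
The internals of log_context are not described here. One common way to build this kind of scoped ID in Python is the standard contextvars module; the sketch below assumes that approach. The names sketch_log_context and current_correlation_id, and the "corr-" prefix for generated IDs, are assumptions made for the example.

```python
import contextvars
import uuid
from contextlib import contextmanager

# Hypothetical storage for the current correlation ID; None means "not set".
_correlation_id = contextvars.ContextVar("correlation_id", default=None)


def current_correlation_id():
    """Return the correlation ID active in the current context, if any."""
    return _correlation_id.get()


@contextmanager
def sketch_log_context(correlation_id=None):
    """Set a correlation ID for the duration of the block, then restore the old one."""
    value = correlation_id or f"corr-{uuid.uuid4().hex[:8]}"
    token = _correlation_id.set(value)
    try:
        yield value
    finally:
        _correlation_id.reset(token)


with sketch_log_context("req-12345") as outer:
    with sketch_log_context() as inner:
        nested = current_correlation_id()
    restored = current_correlation_id()
after = current_correlation_id()
```

Resetting with the token restores the enclosing ID when contexts are nested, which a plain global variable would not do. Depending on the Python version, a value set this way may not carry over into a thread started with threading.Thread, so worker threads may need to enter their own context.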

Performance Monitoring

Automatic Monitoring

The system automatically tracks:

  • HTTP request response times
  • Database query performance
  • Stripe API call latencies
  • Slow operations (requests slower than 1 second, queries slower than 100 ms)

Manual Performance Logging

import time

from logging_config import log_performance, performance_logger

@log_performance("payment_processing")
def process_payment(payment_data):
    # Function implementation
    pass

# Or manually (some_operation is a placeholder for the code being timed)
start_time = time.perf_counter()  # monotonic clock, better suited to timing than time.time()
result = some_operation()
duration_ms = (time.perf_counter() - start_time) * 1000
performance_logger.log_request_time("operation_name", "GET", duration_ms, 200)
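
For readers curious how a decorator such as log_performance might work, here is a minimal sketch. It is not the implementation in logging_config.py: the name sketch_log_performance, the in-memory timings list, and reusing the 1 second threshold for every operation are all assumptions.

```python
import functools
import time

SLOW_THRESHOLD_MS = 1000.0  # assumption: reuse the 1 second slow-request rule
timings = []  # stand-in for a real performance logger


def sketch_log_performance(operation_name):
    """Record how long the wrapped function takes, even if it raises."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                duration_ms = (time.perf_counter() - start) * 1000
                timings.append({
                    "operation": operation_name,
                    "duration_ms": duration_ms,
                    "slow": duration_ms > SLOW_THRESHOLD_MS,
                })
        return wrapper
    return decorator


@sketch_log_performance("payment_processing")
def process_payment(payment_data):
    return {"status": "ok", **payment_data}


result = process_payment({"payment_id": 123})
```

Measuring inside a finally block means failed calls are timed too, which matters when slow operations are also the ones that time out.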

Security Event Monitoring

Automatic Detection

The middleware automatically detects and logs:

  • SQL injection attempts
  • Cross-site scripting (XSS) attempts
  • Failed authentication attempts
  • Suspicious user agents
  • Access to admin endpoints
  • Brute force attack patterns
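
The detection rules used by middleware.py are not listed in this document. Purely as an illustration of pattern-based detection, the sketch below checks a request value against two deliberately simple regular expressions. The patterns and the function name classify_request_value are assumptions, and real rules would need to be much more thorough.

```python
import re

# Illustrative patterns only; these will miss many real attacks.
SUSPICIOUS_PATTERNS = {
    "sql_injection": re.compile(
        r"('|%27)\s*(or|and)\s+\d+\s*=\s*\d+|union\s+select|;\s*drop\s+table",
        re.IGNORECASE,
    ),
    "xss": re.compile(r"<\s*script\b|javascript:|on(error|load)\s*=", re.IGNORECASE),
}


def classify_request_value(value):
    """Return the names of all suspicious patterns found in a request value."""
    return sorted(name for name, pattern in SUSPICIOUS_PATTERNS.items()
                  if pattern.search(value))


findings = classify_request_value("1' OR 1=1 --")
```

A match like this is a reason to log a security event, not proof of an attack; legitimate input can occasionally trigger simple patterns.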

Manual Security Logging

from logging_config import security_logger

# Log permission violations
security_logger.log_permission_denied("username", "delete_payment", "payment/123", "192.168.1.1")

# Log fraud alerts
security_logger.log_payment_fraud_alert(payment_id=123, customer_id="cus_456", 
                                       reason="Multiple failed attempts", amount=1000.0)

Log Retention and Management

Retention Policies

Default retention periods:

  • Application logs: 30 days
  • Performance logs: 14 days
  • Security logs: 90 days
  • Payment processing logs: 60 days

Automated Cleanup

  • Runs daily at 2:00 AM
  • Compresses logs older than configured threshold
  • Archives important logs before deletion
  • Monitors disk space usage
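
The retention periods above can be read as a simple decision per log file: keep it, compress it, or delete it. The sketch below shows that decision under a few assumptions: the category keys are invented for the example, the 7 day compression threshold is borrowed from the sample configuration later in this document, and archiving before deletion is not modeled.

```python
from datetime import date

# Default retention periods from the list above; the category keys are hypothetical.
RETENTION_DAYS = {
    "application": 30,
    "performance": 14,
    "security": 90,
    "payment_processing": 60,
}


def retention_action(category, last_modified, today, compress_after_days=7):
    """Return 'delete', 'compress', or 'keep' for a log file of the given age."""
    age_days = (today - last_modified).days
    if age_days > RETENTION_DAYS[category]:
        return "delete"
    if age_days > compress_after_days:
        return "compress"
    return "keep"


today = date(2024, 10, 1)
performance_action = retention_action("performance", date(2024, 9, 10), today)
security_action = retention_action("security", date(2024, 9, 10), today)
recent_action = retention_action("application", date(2024, 9, 28), today)
```

The same 21-day-old file is deleted under the performance policy but only compressed under the security policy, which is the point of per-type retention.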

Manual Management

from log_retention import retention_manager

# Get statistics
stats = retention_manager.get_log_statistics()

# Manual cleanup
cleanup_stats = retention_manager.cleanup_logs()

# Emergency cleanup (when disk space is low)
emergency_stats = retention_manager.emergency_cleanup(target_size_mb=500)

Analytics Dashboard

Access

Navigate to /analytics/dashboard (requires Finance+ permissions)

Features

  • System Health: Real-time health score and key metrics
  • Performance Monitoring: Response times, slow requests, database performance
  • Payment Analytics: Success rates, error analysis, trends
  • Security Events: Failed logins, suspicious activity, fraud alerts
  • Log Search: Full-text search with filtering and pagination

API Endpoints

  • GET /analytics/api/system-health - Current system health metrics
  • GET /analytics/api/performance-metrics - Performance analysis data
  • GET /analytics/api/payment-analytics - Payment processing statistics
  • GET /analytics/api/security-events - Security event summary
  • GET /analytics/api/logs/search - Search system logs

Best Practices

For Developers

  1. Use Structured Logging

    # Good
    logger.info("Payment processed", payment_id=123, amount=89.95, status="success")
    
    # Avoid
    logger.info(f"Payment {payment_id} processed for ${amount} - status: {status}")
    
  2. Include Context

    # Include relevant context in all log messages
    logger.info("Payment failed", 
                payment_id=payment.id,
                customer_id=payment.customer_id,
                error_code=error.code,
                error_message=str(error))
    
  3. Use Appropriate Log Levels

    • DEBUG: Detailed diagnostic information
    • INFO: General information about system operation
    • WARNING: Something unexpected happened but system continues
    • ERROR: Serious problem that prevented function completion
    • CRITICAL: Very serious error that may abort the program
  4. Security-Sensitive Data

    # Log only masked or non-sensitive values
    logger.info("Payment processed", 
                payment_id=123,
                amount=89.95,
                card_last4="1234")  # OK - only last 4 digits
    
    # Never log full card numbers, CVVs, passwords, etc.
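
One way to enforce this rule is to pass fields through a redaction helper before logging them. The following is a sketch, not a feature of the Plutus logging module: redact_fields, the SENSITIVE_KEYS names, and the card-number pattern are all assumptions, and the pattern only catches 13 to 19 digit sequences.

```python
import re

# Matches 13 to 19 digits, optionally separated by single spaces or dashes.
CARD_NUMBER = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")
SENSITIVE_KEYS = {"password", "cvv", "card_number"}  # hypothetical field names


def redact_fields(fields):
    """Return a copy of fields that is safer to pass to a logger."""
    safe = {}
    for key, value in fields.items():
        if key in SENSITIVE_KEYS:
            safe[key] = "[REDACTED]"
        elif isinstance(value, str):
            # Keep only the last four digits of anything that looks like a card number.
            safe[key] = CARD_NUMBER.sub(
                lambda m: "****" + re.sub(r"\D", "", m.group())[-4:], value)
        else:
            safe[key] = value
    return safe


safe = redact_fields({"payment_id": 123, "cvv": "987",
                      "note": "card 4242 4242 4242 4242 declined"})
```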
    

For Operations

  1. Monitor Key Metrics

    • System health score (target: >90%)
    • Payment success rate (target: >95%)
    • Error rate (target: <5%)
    • Average response time (target: <1000ms)
  2. Set Up Alerts

    • Health score drops below 75%
    • Payment success rate drops below 90%
    • Multiple security events in short timeframe
    • Disk space usage exceeds 80%
  3. Regular Review

    • Weekly review of security events
    • Monthly analysis of performance trends
    • Quarterly review of retention policies
    • Annual security audit of logged events
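
The numeric alert thresholds listed under "Set Up Alerts" translate directly into a small check. This is a sketch under assumptions: the metric key names and alert names are invented for the example, and all values are expected as percentages.

```python
def evaluate_alerts(metrics):
    """Return the alert names triggered by a metrics snapshot (percent values)."""
    alerts = []
    if metrics["health_score"] < 75:
        alerts.append("health_score_low")
    if metrics["payment_success_rate"] < 90:
        alerts.append("payment_success_rate_low")
    if metrics["disk_usage"] > 80:
        alerts.append("disk_usage_high")
    return alerts


triggered = evaluate_alerts(
    {"health_score": 92, "payment_success_rate": 88.5, "disk_usage": 81})
healthy = evaluate_alerts(
    {"health_score": 95, "payment_success_rate": 97, "disk_usage": 40})
```

The disk usage percentage could be derived from the standard library's shutil.disk_usage, which reports total, used, and free bytes for a path.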

For Security

  1. Monitor for Patterns

    • Multiple failed logins from same IP
    • Unusual payment amounts or frequencies
    • Access attempts to admin endpoints
    • SQL injection or XSS attempts
  2. Incident Response

    • Use correlation IDs to trace incident across systems
    • Export relevant logs for forensic analysis
    • Coordinate with development team using structured log data
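
The "multiple failed logins from same IP" pattern can be detected with a sliding time window. The class below is a minimal sketch of that technique; FailedLoginTracker and its default limits are hypothetical and are not taken from the middleware.

```python
from collections import defaultdict, deque


class FailedLoginTracker:
    """Flag an IP once it exceeds max_failures within window_seconds."""

    def __init__(self, max_failures=5, window_seconds=300):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._failures = defaultdict(deque)

    def record_failure(self, ip_address, timestamp):
        """Record a failed login; return True if the IP now looks like brute force."""
        attempts = self._failures[ip_address]
        attempts.append(timestamp)
        # Drop attempts that have fallen out of the sliding window.
        while attempts and timestamp - attempts[0] > self.window_seconds:
            attempts.popleft()
        return len(attempts) > self.max_failures


tracker = FailedLoginTracker(max_failures=3, window_seconds=60)
flags = [tracker.record_failure("192.168.1.1", t) for t in (0, 10, 20, 30, 200)]
```

Here the fourth failure within 60 seconds is flagged, while the attempt at 200 seconds starts a fresh window.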

Configuration

Environment Variables

# Optional: Override default log retention
LOG_RETENTION_DAYS=30
LOG_CLEANUP_TIME=02:00
LOG_MAX_FILE_SIZE_MB=100
LOG_ARCHIVE_COMPRESS=true
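
How the application reads these variables is not shown here. A sketch of one reasonable approach follows; it assumes the values listed above are also the defaults, and the function name load_log_settings and the returned key names are hypothetical.

```python
import os


def load_log_settings(environ=os.environ):
    """Read the optional overrides, falling back to the values shown above."""
    hour, minute = environ.get("LOG_CLEANUP_TIME", "02:00").split(":")
    return {
        "retention_days": int(environ.get("LOG_RETENTION_DAYS", "30")),
        "cleanup_time": (int(hour), int(minute)),
        "max_file_size_mb": int(environ.get("LOG_MAX_FILE_SIZE_MB", "100")),
        "archive_compress":
            environ.get("LOG_ARCHIVE_COMPRESS", "true").lower() == "true",
    }


settings = load_log_settings(
    {"LOG_RETENTION_DAYS": "45", "LOG_ARCHIVE_COMPRESS": "false"})
```

Passing the environment as a parameter keeps the function easy to test without modifying os.environ.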

Programmatic Configuration

from log_retention import LogRetentionManager  # assumes the class is exported by log_retention.py

# Custom retention configuration
custom_config = {
    'retention_policies': {
        'security.log': {'days': 180, 'compress_after_days': 7},
        'performance.log': {'days': 7, 'compress_after_days': 1},
        'default': {'days': 30, 'compress_after_days': 7}
    },
    'cleanup_schedule': '03:00',
    'max_file_size_mb': 50
}

retention_manager = LogRetentionManager(custom_config)

Troubleshooting

Common Issues

  1. Logs Not Appearing

    • Check logs directory permissions
    • Verify logger configuration in app initialization
    • Check disk space availability
  2. High Disk Usage

    • Run manual cleanup: python log_retention.py
    • Reduce retention periods for non-critical logs
    • Enable compression for all log types
  3. Performance Impact

    • Disable DEBUG level logging in production
    • Reduce log verbosity for high-frequency operations
    • Use async logging for high-throughput scenarios
  4. Missing Correlation IDs

    • Ensure middleware is properly initialized
    • Check that log context is being used in threaded operations
    • Verify correlation ID propagation in external API calls

Log Analysis Commands

# Search for a specific payment (exact ID, so 12345 does not also match 123456)
grep -E '"payment_id": 12345([^0-9]|$)' logs/plutus_detailed.log

# Find errors logged during the previous clock hour (date -d requires GNU date)
grep "$(date -d '1 hour ago' '+%Y-%m-%d %H')" logs/plutus_detailed.log | grep ' - ERROR - '

# Count security events by type (first word of the message)
awk -F' - ' '/SECURITY/ {split($5, words, " "); print words[1]}' logs/security.log | sort | uniq -c

# Monitor real-time logs
tail -f logs/plutus_detailed.log

# Trace a correlation ID across all logs in chronological order
grep -H "corr-abc123" logs/*.log | sort -t: -k2
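
When the same analysis is needed inside a script rather than a shell, the security format can be split on its " - " separators. The sketch below counts events by the first word of the message; count_security_events is an illustrative helper, and the second and third sample lines are made up for the example.

```python
from collections import Counter


def count_security_events(lines):
    """Count events by the first word of the message in the security format."""
    counts = Counter()
    for line in lines:
        # timestamp - SECURITY - [correlation id] - LEVEL - message
        parts = line.strip().split(" - ", 4)
        if len(parts) == 5 and parts[1] == "SECURITY":
            counts[parts[4].split(" ", 1)[0]] += 1
    return counts


sample_lines = [
    '2024-09-02 14:30:15,123 - SECURITY - [corr-abc123] - WARNING - '
    'LOGIN_FAILED for user: testuser {"ip_address": "192.168.1.1"}',
    '2024-09-02 14:30:16,456 - SECURITY - [corr-def456] - WARNING - '
    'LOGIN_FAILED for user: other {"ip_address": "192.168.1.2"}',
    '2024-09-02 14:31:00,000 - SECURITY - [corr-ghi789] - WARNING - '
    'PERMISSION_DENIED for user: testuser {}',
]
event_counts = count_security_events(sample_lines)
```

Limiting the split to four separators keeps any " - " inside the message itself from shifting the fields.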

Support and Maintenance

Log File Monitoring

Set up monitoring for:

  • Log file growth rates
  • Error frequency patterns
  • Security event trends
  • System performance degradation

Regular Maintenance

  • Weekly: Review disk space and cleanup if needed
  • Monthly: Analyze performance trends and optimize slow queries
  • Quarterly: Review retention policies and adjust as needed
  • Annually: Audit security events and update detection rules

Contact Information

For logging system issues or questions:

  • Development Team: Review code in logging_config.py, middleware.py
  • Operations Team: Monitor analytics dashboard and system health
  • Security Team: Review security logs and event patterns

Version History

  • v1.0 (Phase 8): Initial enhanced logging implementation
  • v1.1 (Phase 9): Analytics dashboard and retention system
  • v1.2: Correlation ID improvements and performance optimization

This logging system provides comprehensive visibility into the Plutus Payment System while maintaining security, performance, and operational efficiency. Regular review and maintenance of the logging infrastructure ensures continued reliability and usefulness for system monitoring and troubleshooting.