Plutus Payment System - Logging Best Practices

Overview

This document outlines the enhanced logging system implemented in the Plutus Payment Processing System. The logging infrastructure provides comprehensive monitoring, security event tracking, performance analysis, and automated log management.

Logging Architecture

Core Components

Enhanced Logging Configuration (logging_config.py)
- Structured logging with correlation IDs
- Multiple specialized logger types
- Automatic log formatting and rotation
Middleware System (middleware.py)
- Request/response logging
- Performance monitoring
- Security event detection
- Database query tracking
Analytics Dashboard (blueprints/analytics.py)
- Real-time system health monitoring
- Performance metrics visualization
- Security event analysis
- Log search and filtering
Log Retention System (log_retention.py)
- Automated cleanup and archiving
- Configurable retention policies
- Disk space management

Logger Types

StructuredLogger

General-purpose logger with correlation ID support and structured data.

from logging_config import get_logger

logger = get_logger('module_name')
logger.info("Payment processed successfully", 
            payment_id=12345, 
            amount=89.95, 
            customer_id="cus_123")
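
How get_logger turns keyword arguments into structured data is not shown in this document. One way to picture it is a thin wrapper over the standard library's logging module that appends the keyword arguments as JSON. The sketch below makes that assumption; KeywordLogger and the logger name plutus.sketch are hypothetical and are not the actual logging_config implementation.

```python
import io
import json
import logging


class KeywordLogger:
    """Hypothetical wrapper: appends keyword arguments to the message as JSON."""

    def __init__(self, name):
        self._logger = logging.getLogger(name)

    def info(self, message, **fields):
        # Serialize the structured fields so they stay machine-parseable.
        suffix = f" {json.dumps(fields, sort_keys=True)}" if fields else ""
        self._logger.info("%s%s", message, suffix)


# Capture output in memory so the resulting line can be inspected.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(name)s - %(levelname)s - %(message)s"))
base = logging.getLogger("plutus.sketch")
base.addHandler(handler)
base.setLevel(logging.INFO)
base.propagate = False

KeywordLogger("plutus.sketch").info("Payment processed successfully",
                                    payment_id=12345, amount=89.95)
line = stream.getvalue().strip()
```

Keeping the fields in a JSON suffix, rather than interpolating them into the message, is what lets later tooling filter on payment_id or amount without fragile text matching.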

SecurityLogger

Specialized logger for security events and threats.

from logging_config import security_logger

security_logger.log_login_attempt("username", success=False, ip_address="192.168.1.1")
security_logger.log_payment_fraud_alert(payment_id=123, customer_id="cus_456", 
                                       reason="Unusual amount pattern", amount=5000.0)

PerformanceLogger

Dedicated logger for performance monitoring and optimization.

from logging_config import performance_logger

performance_logger.log_request_time("POST /payments", "POST", 1250.5, 200, user_id=1)
performance_logger.log_stripe_api_call("create_payment", 850.2, True)

Log Files Structure

File Organization

logs/
├── plutus_detailed.log          # Comprehensive application logs
├── performance.log              # Performance metrics and slow operations
├── security.log                 # Security events and threats
├── payment_processing.log       # Payment-specific operations
├── archive/                     # Archived logs by month
│   ├── 202409/
│   └── 202410/
└── *.log.gz                    # Compressed rotated logs

Log Formats

Standard Format

2024-09-02 14:30:15,123 - [corr-abc123] - plutus.payments - INFO - Payment processed successfully {"payment_id": 12345, "amount": 89.95}

Security Format

2024-09-02 14:30:15,123 - SECURITY - [corr-abc123] - WARNING - LOGIN_FAILED for user: testuser {"ip_address": "192.168.1.1", "user_agent": "Mozilla/5.0..."}

Performance Format

2024-09-02 14:30:15,123 - PERF - [corr-abc123] - REQUEST: POST /payments - 1250.50ms - 200 {"user_id": 1, "endpoint": "/payments"}
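
Because the formats above are regular, a line can be split back into its fields for ad hoc analysis. The following is a minimal sketch for the Standard Format only, assuming every line follows the sample exactly; parse_standard_line and STANDARD_LINE are illustrative names, not part of the Plutus codebase.

```python
import json
import re
from typing import Optional

# Pattern for the Standard Format sample; the trailing JSON object is optional.
STANDARD_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - "
    r"\[(?P<correlation_id>[^\]]+)\] - "
    r"(?P<logger>\S+) - "
    r"(?P<level>[A-Z]+) - "
    r"(?P<message>.*?)"
    r"(?: (?P<data>\{.*\}))?$"
)


def parse_standard_line(line: str) -> Optional[dict]:
    """Return the fields of one log line, or None if it does not match."""
    match = STANDARD_LINE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    raw = record.pop("data")
    try:
        record["data"] = json.loads(raw) if raw else {}
    except json.JSONDecodeError:
        # Braces that are not valid JSON are treated as part of the message.
        record["message"] = f"{record['message']} {raw}"
        record["data"] = {}
    return record


sample = ('2024-09-02 14:30:15,123 - [corr-abc123] - plutus.payments - INFO - '
          'Payment processed successfully {"payment_id": 12345, "amount": 89.95}')
parsed = parse_standard_line(sample)
```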

Correlation IDs

Purpose

Correlation IDs track requests across the entire system, making it easy to trace a single operation through multiple components.

Usage

from logging_config import get_logger, log_context, set_correlation_id

logger = get_logger('module_name')

# Automatic correlation ID
with log_context():
    logger.info("Processing payment")  # Will include auto-generated correlation ID

# Custom correlation ID
with log_context("req-12345"):
    logger.info("Processing payment")  # Will include "req-12345"

# Manual setting
correlation_id = set_correlation_id("custom-id")
logger.info("Payment processed")

Performance Monitoring

Automatic Monitoring

The system automatically tracks:

- HTTP request response times
- Database query performance
- Stripe API call latencies
- Slow operations (requests over 1 second, queries over 100 ms)

Manual Performance Logging

import time

from logging_config import log_performance, performance_logger

@log_performance("payment_processing")
def process_payment(payment_data):
    # Function implementation
    pass

# Or manually
start_time = time.time()
result = some_operation()
duration_ms = (time.time() - start_time) * 1000
performance_logger.log_request_time("operation_name", "GET", duration_ms, 200)
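
For readers curious what a decorator like log_performance might do internally, here is a minimal sketch using only the standard library. It is an assumption about the general technique, not the project's code; log_performance_sketch, perf_log and the 1000 ms default (taken from the slow-request threshold mentioned earlier) are illustrative.

```python
import functools
import logging
import time

perf_log = logging.getLogger("plutus.perf_sketch")


def log_performance_sketch(operation, slow_threshold_ms=1000.0):
    """Log how long the wrapped function takes; warn when it exceeds the threshold."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Runs on success and on exceptions, so failures are timed too.
                duration_ms = (time.perf_counter() - start) * 1000
                level = logging.WARNING if duration_ms > slow_threshold_ms else logging.INFO
                perf_log.log(level, "%s took %.2fms", operation, duration_ms)

        return wrapper

    return decorator


@log_performance_sketch("payment_processing")
def process_payment(payment_data):
    return {"status": "ok", **payment_data}
```

time.perf_counter() is used instead of time.time() because it is monotonic and is not affected by system clock adjustments during the measurement.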

Security Event Monitoring

Automatic Detection

The middleware automatically detects and logs:

- SQL injection attempts
- Cross-site scripting (XSS) attempts
- Failed authentication attempts
- Suspicious user agents
- Access to admin endpoints
- Brute force attack patterns
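
The detection rules themselves live in middleware.py and are not reproduced in this document. As one illustration of the last item, brute-force patterns are often detected with a sliding window of failed logins per IP address. The sketch below assumes that approach; FailedLoginTracker and its limits (5 failures in 300 seconds) are hypothetical values, not the middleware's actual thresholds.

```python
from collections import defaultdict, deque


class FailedLoginTracker:
    """Flag an IP once it reaches max_failures within window_seconds (assumed limits)."""

    def __init__(self, max_failures=5, window_seconds=300.0):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._failures = defaultdict(deque)

    def record_failure(self, ip_address, timestamp):
        """Record one failed login; return True when the IP looks like brute force."""
        attempts = self._failures[ip_address]
        attempts.append(timestamp)
        # Drop attempts that fell out of the sliding window.
        while attempts and timestamp - attempts[0] > self.window_seconds:
            attempts.popleft()
        return len(attempts) >= self.max_failures


tracker = FailedLoginTracker(max_failures=3, window_seconds=60)
results = [tracker.record_failure("192.168.1.1", t) for t in (0, 10, 20)]
```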

Manual Security Logging

from logging_config import security_logger

# Log permission violations
security_logger.log_permission_denied("username", "delete_payment", "payment/123", "192.168.1.1")

# Log fraud alerts
security_logger.log_payment_fraud_alert(payment_id=123, customer_id="cus_456", 
                                       reason="Multiple failed attempts", amount=1000.0)

Log Retention and Management

Retention Policies

Default retention periods:

- Application logs: 30 days
- Performance logs: 14 days
- Security logs: 90 days
- Payment processing logs: 60 days

Automated Cleanup

- Runs daily at 2:00 AM
- Compresses logs older than configured threshold
- Archives important logs before deletion
- Monitors disk space usage
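
The cleanup decision for a single file can be sketched as a pure function of its age. The example below is an assumption about the logic, not the log_retention.py implementation: the mapping of retention periods to file names, the 7-day compression threshold, and the name retention_action are illustrative, and the archive step is left out for brevity.

```python
from datetime import datetime, timedelta

# Retention periods from the defaults above; mapping them to these file names is assumed.
RETENTION_DAYS = {
    "plutus_detailed.log": 30,
    "performance.log": 14,
    "security.log": 90,
    "payment_processing.log": 60,
}


def retention_action(filename, modified_at, now, compress_after_days=7):
    """Return 'delete', 'compress', or 'keep' for a log file based on its age."""
    age = now - modified_at
    max_age = timedelta(days=RETENTION_DAYS.get(filename, 30))
    if age > max_age:
        return "delete"
    if age > timedelta(days=compress_after_days) and not filename.endswith(".gz"):
        return "compress"
    return "keep"


now = datetime(2024, 10, 1)
old_perf = retention_action("performance.log", now - timedelta(days=15), now)
old_security = retention_action("security.log", now - timedelta(days=15), now)
fresh_security = retention_action("security.log", now - timedelta(days=2), now)
```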

Manual Management

from log_retention import retention_manager

# Get statistics
stats = retention_manager.get_log_statistics()

# Manual cleanup
cleanup_stats = retention_manager.cleanup_logs()

# Emergency cleanup (when disk space is low)
emergency_stats = retention_manager.emergency_cleanup(target_size_mb=500)

Analytics Dashboard

Access

Navigate to /analytics/dashboard (requires Finance+ permissions)

Features

- System Health: Real-time health score and key metrics
- Performance Monitoring: Response times, slow requests, database performance
- Payment Analytics: Success rates, error analysis, trends
- Security Events: Failed logins, suspicious activity, fraud alerts
- Log Search: Full-text search with filtering and pagination

API Endpoints

- GET /analytics/api/system-health - Current system health metrics
- GET /analytics/api/performance-metrics - Performance analysis data
- GET /analytics/api/payment-analytics - Payment processing statistics
- GET /analytics/api/security-events - Security event summary
- GET /analytics/api/logs/search - Search system logs

Best Practices

For Developers

Use Structured Logging

# Good
logger.info("Payment processed", payment_id=123, amount=89.95, status="success")

# Avoid
logger.info(f"Payment {payment_id} processed for ${amount} - status: {status}")

Include Context

# Include relevant context in all log messages
logger.info("Payment failed", 
            payment_id=payment.id,
            customer_id=payment.customer_id,
            error_code=error.code,
            error_message=str(error))

Use Appropriate Log Levels
- DEBUG: Detailed diagnostic information
- INFO: General information about system operation
- WARNING: Something unexpected happened but system continues
- ERROR: Serious problem that prevented function completion
- CRITICAL: Very serious error that may abort the program

Security-Sensitive Data

# Log only non-sensitive or masked fields
logger.info("Payment processed", 
            payment_id=123,
            amount=89.95,
            card_last4="1234")  # OK - only last 4 digits

# Avoid logging full card numbers, CVV, passwords, etc.
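
One way to enforce this rule in code, rather than relying on every call site, is to pass log fields through a redaction helper first. The sketch below is a hypothetical helper (redact_fields, SENSITIVE_KEYS and the field names are assumptions, not part of logging_config); it keeps only the last four digits of a card number and masks other secrets.

```python
import re

SENSITIVE_KEYS = {"password", "cvv", "card_number"}  # assumed field names
CARD_NUMBER = re.compile(r"\b\d{13,19}\b")


def redact_fields(fields):
    """Return a copy of the log fields that is safe to write to disk."""
    safe = {}
    for key, value in fields.items():
        if key == "card_number":
            # Keep only the last four digits, stored under card_last4.
            digits = re.sub(r"\D", "", str(value))
            safe["card_last4"] = digits[-4:]
        elif key in SENSITIVE_KEYS:
            safe[key] = "[REDACTED]"
        elif isinstance(value, str):
            # Mask card-like digit runs that slipped into free-text values.
            safe[key] = CARD_NUMBER.sub("[REDACTED]", value)
        else:
            safe[key] = value
    return safe


cleaned = redact_fields({
    "payment_id": 123,
    "card_number": "4242 4242 4242 4242",
    "cvv": "123",
    "note": "card 4242424242424242 declined",
})
```

The digit-run pattern is a coarse heuristic: it can also mask long numeric IDs, which is usually an acceptable trade-off for logs.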

For Operations

Monitor Key Metrics
- System health score (target: >90%)
- Payment success rate (target: >95%)
- Error rate (target: <5%)
- Average response time (target: <1000ms)
Set Up Alerts
- Health score drops below 75%
- Payment success rate drops below 90%
- Multiple security events in short timeframe
- Disk space usage exceeds 80%
Regular Review
- Weekly review of security events
- Monthly analysis of performance trends
- Quarterly review of retention policies
- Annual security audit of logged events
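
The alert thresholds listed under Set Up Alerts translate directly into a small check over a metrics snapshot. This is a sketch only: evaluate_alerts and the metric key names are assumptions, and the "multiple security events" rule is omitted because the document does not quantify it.

```python
def evaluate_alerts(metrics):
    """Return the alert messages triggered by a metrics snapshot (values in percent)."""
    alerts = []
    if metrics["health_score"] < 75:
        alerts.append("health score below 75%")
    if metrics["payment_success_rate"] < 90:
        alerts.append("payment success rate below 90%")
    if metrics["disk_usage_percent"] > 80:
        alerts.append("disk usage above 80%")
    return alerts


triggered = evaluate_alerts({
    "health_score": 70,
    "payment_success_rate": 96,
    "disk_usage_percent": 85,
})
```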

For Security

Monitor for Patterns
- Multiple failed logins from same IP
- Unusual payment amounts or frequencies
- Access attempts to admin endpoints
- SQL injection or XSS attempts
Incident Response
- Use correlation IDs to trace incident across systems
- Export relevant logs for forensic analysis
- Coordinate with development team using structured log data

Configuration

Environment Variables

# Optional: Override default log retention
LOG_RETENTION_DAYS=30
LOG_CLEANUP_TIME=02:00
LOG_MAX_FILE_SIZE_MB=100
LOG_ARCHIVE_COMPRESS=true
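
How the application reads these variables is not shown. A typical pattern, sketched here under that assumption, is to parse them once at startup with the values above as defaults; LogSettings and load_log_settings are hypothetical names.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class LogSettings:
    retention_days: int
    cleanup_time: str
    max_file_size_mb: int
    archive_compress: bool


def load_log_settings(env=None):
    """Read the LOG_* variables, falling back to the values shown above."""
    env = os.environ if env is None else env
    compress = env.get("LOG_ARCHIVE_COMPRESS", "true").lower()
    return LogSettings(
        retention_days=int(env.get("LOG_RETENTION_DAYS", "30")),
        cleanup_time=env.get("LOG_CLEANUP_TIME", "02:00"),
        max_file_size_mb=int(env.get("LOG_MAX_FILE_SIZE_MB", "100")),
        archive_compress=compress in ("1", "true", "yes"),
    )


settings = load_log_settings({"LOG_RETENTION_DAYS": "45", "LOG_ARCHIVE_COMPRESS": "false"})
```

Passing a plain dict instead of os.environ keeps the parsing logic easy to test.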

Programmatic Configuration

from log_retention import LogRetentionManager  # assumed to live alongside retention_manager

# Custom retention configuration
custom_config = {
    'retention_policies': {
        'security.log': {'days': 180, 'compress_after_days': 7},
        'performance.log': {'days': 7, 'compress_after_days': 1},
        'default': {'days': 30, 'compress_after_days': 7}
    },
    'cleanup_schedule': '03:00',
    'max_file_size_mb': 50
}

retention_manager = LogRetentionManager(custom_config)

Troubleshooting

Common Issues

Logs Not Appearing
- Check logs directory permissions
- Verify logger configuration in app initialization
- Check disk space availability
High Disk Usage
- Run manual cleanup: python log_retention.py
- Reduce retention periods for non-critical logs
- Enable compression for all log types
Performance Impact
- Disable DEBUG level logging in production
- Reduce log verbosity for high-frequency operations
- Use async logging for high-throughput scenarios
Missing Correlation IDs
- Ensure middleware is properly initialized
- Check that log context is being used in threaded operations
- Verify correlation ID propagation in external API calls
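
For the async logging suggestion under Performance Impact, the standard library already offers QueueHandler and QueueListener, which move file I/O off the request thread. The sketch below shows that general technique with an in-memory stream standing in for a log file; it is not taken from logging_config.py, and the logger name is illustrative.

```python
import io
import logging
import logging.handlers
import queue

log_queue = queue.Queue(-1)  # unbounded queue between callers and the writer thread
stream = io.StringIO()       # stands in for a file handler in this sketch
target = logging.StreamHandler(stream)
target.setFormatter(logging.Formatter("%(levelname)s - %(message)s"))

# The listener thread drains the queue and performs the slow I/O.
listener = logging.handlers.QueueListener(log_queue, target)
listener.start()

async_logger = logging.getLogger("plutus.async_sketch")
async_logger.addHandler(logging.handlers.QueueHandler(log_queue))
async_logger.setLevel(logging.INFO)
async_logger.propagate = False

async_logger.info("Payment processed")  # returns as soon as the record is queued
listener.stop()                          # processes queued records, then joins the thread
output = stream.getvalue()
```

In a real application the listener would be started at startup and stopped at shutdown, so that records still in the queue are written before the process exits.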

Log Analysis Commands

# Search for specific payment
grep "payment_id.*12345" logs/plutus_detailed.log

# Find errors logged during the previous clock hour (requires GNU date)
grep "$(date -d '1 hour ago' '+%Y-%m-%d %H')" logs/plutus_detailed.log | grep ERROR

# Count security events by type (the event type is the first word after the log level)
grep "SECURITY" logs/security.log | awk -F' - ' '{split($5, a, " "); print a[1]}' | sort | uniq -c

# Monitor real-time logs
tail -f logs/plutus_detailed.log

# Analyze correlation ID flow
grep "corr-abc123" logs/*.log | sort

Support and Maintenance

Log File Monitoring

Set up monitoring for:

- Log file growth rates
- Error frequency patterns
- Security event trends
- System performance degradation

Regular Maintenance

- Weekly: Review disk space and cleanup if needed
- Monthly: Analyze performance trends and optimize slow queries
- Quarterly: Review retention policies and adjust as needed
- Annually: Audit security events and update detection rules

Contact Information

For logging system issues or questions:

- Development Team: Review code in logging_config.py, middleware.py
- Operations Team: Monitor analytics dashboard and system health
- Security Team: Review security logs and event patterns

Version History

- v1.0 (Phase 8): Initial enhanced logging implementation
- v1.1 (Phase 9): Analytics dashboard and retention system
- v1.2: Correlation ID improvements and performance optimization

This logging system provides comprehensive visibility into the Plutus Payment System while maintaining security, performance, and operational efficiency. Regular review and maintenance of the logging infrastructure ensures continued reliability and usefulness for system monitoring and troubleshooting.
