Plutus Payment System - Logging Best Practices
Overview
This document outlines the enhanced logging system implemented in the Plutus Payment Processing System. The logging infrastructure provides comprehensive monitoring, security event tracking, performance analysis, and automated log management.
Logging Architecture
Core Components
- Enhanced Logging Configuration (logging_config.py)
  - Structured logging with correlation IDs
  - Multiple specialized logger types
  - Automatic log formatting and rotation
- Middleware System (middleware.py)
  - Request/response logging
  - Performance monitoring
  - Security event detection
  - Database query tracking
- Analytics Dashboard (blueprints/analytics.py)
  - Real-time system health monitoring
  - Performance metrics visualization
  - Security event analysis
  - Log search and filtering
- Log Retention System (log_retention.py)
  - Automated cleanup and archiving
  - Configurable retention policies
  - Disk space management
Logger Types
StructuredLogger
General-purpose logger with correlation ID support and structured data.
from logging_config import get_logger
logger = get_logger('module_name')
logger.info("Payment processed successfully",
            payment_id=12345,
            amount=89.95,
            customer_id="cus_123")
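The internals of get_logger are not shown in this document. As a rough illustration of how a wrapper of this kind might turn keyword arguments into the JSON suffix seen in the log formats, here is a minimal sketch built on the standard logging module. The names SketchStructuredLogger and format_structured are hypothetical and are not part of logging_config.py.

```python
import json
import logging


def format_structured(message, **fields):
    """Append keyword arguments to the message as a JSON object (hypothetical helper)."""
    if not fields:
        return message
    # default=str keeps values such as datetime or Decimal from raising TypeError.
    return f"{message} {json.dumps(fields, default=str)}"


class SketchStructuredLogger:
    """Hypothetical stand-in for the object returned by get_logger()."""

    def __init__(self, name):
        self._logger = logging.getLogger(name)

    def info(self, message, **fields):
        # Delegate to the standard logger after folding the fields into the message.
        self._logger.info(format_structured(message, **fields))


logger = SketchStructuredLogger("module_name")
logger.info("Payment processed successfully", payment_id=12345, amount=89.95)
```

Keeping the data as keyword arguments until the last moment is what lets a formatter emit it as machine-readable JSON instead of free text.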
SecurityLogger
Specialized logger for security events and threats.
from logging_config import security_logger
security_logger.log_login_attempt("username", success=False, ip_address="192.168.1.1")
security_logger.log_payment_fraud_alert(payment_id=123, customer_id="cus_456",
                                        reason="Unusual amount pattern", amount=5000.0)
PerformanceLogger
Dedicated logger for performance monitoring and optimization.
from logging_config import performance_logger
performance_logger.log_request_time("POST /payments", "POST", 1250.5, 200, user_id=1)
performance_logger.log_stripe_api_call("create_payment", 850.2, True)
Log Files Structure
File Organization
logs/
├── plutus_detailed.log # Comprehensive application logs
├── performance.log # Performance metrics and slow operations
├── security.log # Security events and threats
├── payment_processing.log # Payment-specific operations
├── archive/ # Archived logs by month
│ ├── 202409/
│ └── 202410/
└── *.log.gz # Compressed rotated logs
Log Formats
Standard Format
2024-09-02 14:30:15,123 - [corr-abc123] - plutus.payments - INFO - Payment processed successfully {"payment_id": 12345, "amount": 89.95}
Security Format
2024-09-02 14:30:15,123 - SECURITY - [corr-abc123] - WARNING - LOGIN_FAILED for user: testuser {"ip_address": "192.168.1.1", "user_agent": "Mozilla/5.0..."}
Performance Format
2024-09-02 14:30:15,123 - PERF - [corr-abc123] - REQUEST: POST /payments - 1250.50ms - 200 {"user_id": 1, "endpoint": "/payments"}
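Because the formats above are regular, a line can be split back into fields for ad hoc analysis. The sketch below parses the Standard Format only; the pattern is inferred from the single sample line shown here, so the real formatter may produce variations it does not handle. The names STANDARD_LINE and parse_standard_line are hypothetical.

```python
import json
import re

# Pattern inferred from the "Standard Format" sample line; the real formatter may differ.
STANDARD_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})"
    r" - \[(?P<correlation_id>[^\]]+)\]"
    r" - (?P<logger>\S+)"
    r" - (?P<level>[A-Z]+)"
    r" - (?P<message>.*?)"
    r"(?: (?P<data>\{.*\}))?$"
)


def parse_standard_line(line):
    """Return the fields of one standard-format line, or None if it does not match."""
    match = STANDARD_LINE.match(line.rstrip("\n"))
    if match is None:
        return None
    record = match.groupdict()
    # The trailing JSON object is optional; decode it when present.
    # A malformed JSON suffix raises json.JSONDecodeError in this sketch.
    record["data"] = json.loads(record["data"]) if record["data"] else {}
    return record
```

A message that itself contains " {" would confuse this pattern, so treat it as a starting point, not a complete parser.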
Correlation IDs
Purpose
Correlation IDs track requests across the entire system, making it easy to trace a single operation through multiple components.
Usage
from logging_config import get_logger, log_context, set_correlation_id
logger = get_logger('module_name')
# Automatic correlation ID
with log_context():
    logger.info("Processing payment")  # Will include auto-generated correlation ID
# Custom correlation ID
with log_context("req-12345"):
    logger.info("Processing payment")  # Will include "req-12345"
# Manual setting
correlation_id = set_correlation_id("custom-id")
logger.info("Payment processed")
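How log_context stores the ID is not described here. One common way to get this behavior is a contextvars.ContextVar wrapped in a context manager, sketched below. This is an assumption about the approach, not the actual implementation; sketch_log_context, current_correlation_id, and the "corr-" plus six hex characters ID shape are all hypothetical.

```python
import contextvars
import uuid
from contextlib import contextmanager

# Holds the correlation ID for the current execution context (hypothetical storage).
_correlation_id = contextvars.ContextVar("correlation_id", default=None)


def current_correlation_id():
    """Return the active correlation ID, or None outside any context."""
    return _correlation_id.get()


@contextmanager
def sketch_log_context(correlation_id=None):
    """Set a correlation ID for the duration of the with-block, then restore the old one."""
    cid = correlation_id or f"corr-{uuid.uuid4().hex[:6]}"
    token = _correlation_id.set(cid)
    try:
        yield cid
    finally:
        # reset() restores whatever value was active before, so contexts can nest.
        _correlation_id.reset(token)


with sketch_log_context("req-12345") as cid:
    print(current_correlation_id())
```

Resetting with the token, instead of setting the value back to None, is what makes nested contexts restore the outer ID correctly.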
Performance Monitoring
Automatic Monitoring
The system automatically tracks:
- HTTP request response times
- Database query performance
- Stripe API call latencies
- Slow operations (>1 second requests, >100ms queries)
Manual Performance Logging
import time
from logging_config import log_performance, performance_logger
# Decorator form
@log_performance("payment_processing")
def process_payment(payment_data):
    # Function implementation
    pass
# Or manually
start_time = time.time()
result = some_operation()
duration_ms = (time.time() - start_time) * 1000
performance_logger.log_request_time("operation_name", "GET", duration_ms, 200)
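For readers curious what a timing decorator like log_performance might do internally, here is a minimal sketch. It is not the code in logging_config.py: sketch_log_performance, its slow_ms parameter, and the sink callback are hypothetical, and the 1000 ms default simply mirrors the slow-request threshold mentioned above.

```python
import functools
import time


def sketch_log_performance(operation, slow_ms=1000.0, sink=print):
    """Time the wrapped function and pass one timing record to sink (hypothetical)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Runs on success and on exceptions, so failures are timed too.
                duration_ms = (time.perf_counter() - start) * 1000
                sink({
                    "operation": operation,
                    "duration_ms": duration_ms,
                    "slow": duration_ms > slow_ms,
                })
        return wrapper
    return decorator


records = []


@sketch_log_performance("payment_processing", sink=records.append)
def process_payment(payment_data):
    return {"status": "ok", **payment_data}


result = process_payment({"payment_id": 1})
```

The sketch uses time.perf_counter because it is monotonic, which avoids negative durations if the system clock is adjusted mid-call.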
Security Event Monitoring
Automatic Detection
The middleware automatically detects and logs:
- SQL injection attempts
- Cross-site scripting (XSS) attempts
- Failed authentication attempts
- Suspicious user agents
- Access to admin endpoints
- Brute force attack patterns
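The detection rules used by the middleware are not listed in this document. Purely as an illustration of pattern-based detection for the first two items, the sketch below checks a request value against a couple of regular expressions. The patterns and the names SUSPICIOUS_PATTERNS and detect_threats are hypothetical and far from exhaustive; they should not be read as the middleware's actual rules.

```python
import re

# Illustrative patterns only; real detection needs a much broader and tested rule set.
SUSPICIOUS_PATTERNS = {
    "sql_injection": re.compile(
        r"(\bunion\b\s+\bselect\b|\bor\b\s+1\s*=\s*1|;\s*drop\s+table)",
        re.IGNORECASE,
    ),
    "xss": re.compile(
        r"(<\s*script\b|javascript:|\bon\w+\s*=)",
        re.IGNORECASE,
    ),
}


def detect_threats(value):
    """Return the names of all patterns that match the given request value."""
    return [name for name, pattern in SUSPICIOUS_PATTERNS.items() if pattern.search(value)]


print(detect_threats("id=1 UNION SELECT password FROM users"))
```

Matches from heuristics like these are best logged as events for review, since simple patterns produce both false positives and false negatives.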
Manual Security Logging
from logging_config import security_logger
# Log permission violations
security_logger.log_permission_denied("username", "delete_payment", "payment/123", "192.168.1.1")
# Log fraud alerts
security_logger.log_payment_fraud_alert(payment_id=123, customer_id="cus_456",
                                        reason="Multiple failed attempts", amount=1000.0)
Log Retention and Management
Retention Policies
Default retention periods:
- Application logs: 30 days
- Performance logs: 14 days
- Security logs: 90 days
- Payment processing logs: 60 days
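The retention periods above translate into a simple age check per file. The sketch below shows that check in isolation. The mapping of "application logs" to plutus_detailed.log is an assumption based on the file layout, and DEFAULT_RETENTION_DAYS and is_expired are hypothetical names, not part of log_retention.py.

```python
from datetime import datetime, timedelta

# Retention periods from the list above; the file-name mapping is an assumption.
DEFAULT_RETENTION_DAYS = {
    "plutus_detailed.log": 30,
    "performance.log": 14,
    "security.log": 90,
    "payment_processing.log": 60,
}


def is_expired(log_name, last_modified, now, fallback_days=30):
    """Return True when a log file is older than its retention period."""
    days = DEFAULT_RETENTION_DAYS.get(log_name, fallback_days)
    return now - last_modified > timedelta(days=days)


now = datetime(2024, 10, 1)
print(is_expired("performance.log", datetime(2024, 9, 10), now))
```

Passing now in explicitly, instead of calling datetime.now() inside the function, keeps the check easy to test.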
Automated Cleanup
- Runs daily at 2:00 AM
- Compresses logs older than configured threshold
- Archives important logs before deletion
- Monitors disk space usage
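The compression step can be pictured with the standard gzip module. This is a minimal sketch of compressing one rotated log file in place, not the routine used by log_retention.py; compress_log is a hypothetical name.

```python
import gzip
import shutil
from pathlib import Path


def compress_log(path):
    """Gzip a log file next to the original, remove the original, return the new path."""
    source = Path(path)
    target = source.with_name(source.name + ".gz")
    with source.open("rb") as src, gzip.open(target, "wb") as dst:
        # Stream in chunks so large log files are not loaded into memory at once.
        shutil.copyfileobj(src, dst)
    # Only delete the original after the compressed copy has been written and closed.
    source.unlink()
    return target
```

A production version would also need to skip files that are still open for writing by the active log handler.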
Manual Management
from log_retention import retention_manager
# Get statistics
stats = retention_manager.get_log_statistics()
# Manual cleanup
cleanup_stats = retention_manager.cleanup_logs()
# Emergency cleanup (when disk space is low)
emergency_stats = retention_manager.emergency_cleanup(target_size_mb=500)
Analytics Dashboard
Access
Navigate to /analytics/dashboard (requires Finance+ permissions)
Features
- System Health: Real-time health score and key metrics
- Performance Monitoring: Response times, slow requests, database performance
- Payment Analytics: Success rates, error analysis, trends
- Security Events: Failed logins, suspicious activity, fraud alerts
- Log Search: Full-text search with filtering and pagination
API Endpoints
- GET /analytics/api/system-health - Current system health metrics
- GET /analytics/api/performance-metrics - Performance analysis data
- GET /analytics/api/payment-analytics - Payment processing statistics
- GET /analytics/api/security-events - Security event summary
- GET /analytics/api/logs/search - Search system logs
Best Practices
For Developers
- Use Structured Logging
  # Good
  logger.info("Payment processed", payment_id=123, amount=89.95, status="success")
  # Avoid
  logger.info(f"Payment {payment_id} processed for ${amount} - status: {status}")
- Include Context
  # Include relevant context in all log messages
  logger.info("Payment failed",
              payment_id=payment.id,
              customer_id=payment.customer_id,
              error_code=error.code,
              error_message=str(error))
- Use Appropriate Log Levels
  - DEBUG: Detailed diagnostic information
  - INFO: General information about system operation
  - WARNING: Something unexpected happened but system continues
  - ERROR: Serious problem that prevented function completion
  - CRITICAL: Very serious error that may abort the program
- Security-Sensitive Data
  # Never log sensitive data
  logger.info("Payment processed",
              payment_id=123,
              amount=89.95,
              card_last4="1234")  # OK - only last 4 digits
  # Avoid logging full card numbers, CVV, passwords, etc.
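One way to enforce the rule on sensitive data is to pass fields through a redaction step before they reach the logger. The sketch below is an illustration only: SENSITIVE_KEYS, redact_fields, and the field names card_number, cvv, and password are hypothetical and would need to match the field names the application actually uses.

```python
# Hypothetical field names; adjust to the keys your code really passes to the logger.
SENSITIVE_KEYS = {"card_number", "cvv", "password"}


def redact_fields(fields):
    """Return a copy of fields that is safe to log."""
    safe = {}
    for key, value in fields.items():
        if key == "card_number":
            # Keep only the last four digits, matching the card_last4 convention above.
            safe["card_last4"] = str(value)[-4:]
        elif key in SENSITIVE_KEYS:
            safe[key] = "[REDACTED]"
        else:
            safe[key] = value
    return safe


print(redact_fields({"payment_id": 123, "card_number": "4242424242424242", "cvv": "123"}))
```

An allow-list of known-safe keys is stricter than this deny-list and may be the better choice where new fields are added often.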
For Operations
- Monitor Key Metrics
  - System health score (target: >90%)
  - Payment success rate (target: >95%)
  - Error rate (target: <5%)
  - Average response time (target: <1000ms)
- Set Up Alerts
  - Health score drops below 75%
  - Payment success rate drops below 90%
  - Multiple security events in short timeframe
  - Disk space usage exceeds 80%
- Regular Review
  - Weekly review of security events
  - Monthly analysis of performance trends
  - Quarterly review of retention policies
  - Annual security audit of logged events
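The numeric alert conditions above can be expressed as a small threshold check. This sketch assumes the three metrics are available as percentages; evaluate_alerts and the alert names it returns are hypothetical, and the "multiple security events" condition is left out because this document does not define a count or time window for it.

```python
def evaluate_alerts(health_score, payment_success_rate, disk_usage_pct):
    """Return the names of alert conditions that are currently triggered."""
    alerts = []
    if health_score < 75:
        alerts.append("health_score_low")
    if payment_success_rate < 90:
        alerts.append("payment_success_rate_low")
    if disk_usage_pct > 80:
        alerts.append("disk_usage_high")
    return alerts


print(evaluate_alerts(health_score=70.0, payment_success_rate=88.0, disk_usage_pct=85.0))
```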
For Security
- Monitor for Patterns
  - Multiple failed logins from same IP
  - Unusual payment amounts or frequencies
  - Access attempts to admin endpoints
  - SQL injection or XSS attempts
- Incident Response
  - Use correlation IDs to trace incident across systems
  - Export relevant logs for forensic analysis
  - Coordinate with development team using structured log data
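The "multiple failed logins from same IP" pattern is typically detected with a sliding time window per address. The sketch below shows the idea; FailedLoginTracker and its default limits (5 failures within 300 seconds) are hypothetical values chosen for illustration, not thresholds taken from the Plutus middleware.

```python
from collections import defaultdict, deque


class FailedLoginTracker:
    """Flag an IP once it reaches max_failures within window_seconds (hypothetical)."""

    def __init__(self, max_failures=5, window_seconds=300):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)

    def record_failure(self, ip_address, timestamp):
        """Record one failed login; return True when the IP looks like brute force."""
        events = self._events[ip_address]
        events.append(timestamp)
        # Drop failures that have fallen out of the sliding window.
        while events and timestamp - events[0] > self.window_seconds:
            events.popleft()
        return len(events) >= self.max_failures


tracker = FailedLoginTracker(max_failures=3, window_seconds=60)
flags = [tracker.record_failure("192.168.1.1", t) for t in (0, 10, 20)]
```

Timestamps are passed in as plain seconds here so the logic can be exercised without waiting on a real clock.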
Configuration
Environment Variables
# Optional: Override default log retention
LOG_RETENTION_DAYS=30
LOG_CLEANUP_TIME=02:00
LOG_MAX_FILE_SIZE_MB=100
LOG_ARCHIVE_COMPRESS=true
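How the application reads these variables is not shown. A minimal sketch of loading them with fallbacks follows; it assumes the values listed above are also the defaults, which this document does not state explicitly, and load_retention_settings is a hypothetical name.

```python
import os


def load_retention_settings(environ=os.environ):
    """Read the log retention variables, falling back to assumed defaults."""
    compress = environ.get("LOG_ARCHIVE_COMPRESS", "true").strip().lower()
    return {
        "retention_days": int(environ.get("LOG_RETENTION_DAYS", "30")),
        "cleanup_time": environ.get("LOG_CLEANUP_TIME", "02:00"),
        "max_file_size_mb": int(environ.get("LOG_MAX_FILE_SIZE_MB", "100")),
        # Environment values are strings, so "false" must be parsed, not truth-tested.
        "archive_compress": compress in {"1", "true", "yes"},
    }


print(load_retention_settings({"LOG_RETENTION_DAYS": "7"}))
```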
Programmatic Configuration
# LogRetentionManager is assumed to be defined in log_retention.py alongside retention_manager
from log_retention import LogRetentionManager
# Custom retention configuration
custom_config = {
    'retention_policies': {
        'security.log': {'days': 180, 'compress_after_days': 7},
        'performance.log': {'days': 7, 'compress_after_days': 1},
        'default': {'days': 30, 'compress_after_days': 7}
    },
    'cleanup_schedule': '03:00',
    'max_file_size_mb': 50
}
retention_manager = LogRetentionManager(custom_config)
Troubleshooting
Common Issues
- Logs Not Appearing
  - Check logs directory permissions
  - Verify logger configuration in app initialization
  - Check disk space availability
- High Disk Usage
  - Run manual cleanup: python log_retention.py
  - Reduce retention periods for non-critical logs
  - Enable compression for all log types
- Performance Impact
  - Disable DEBUG level logging in production
  - Reduce log verbosity for high-frequency operations
  - Use async logging for high-throughput scenarios
- Missing Correlation IDs
  - Ensure middleware is properly initialized
  - Check that log context is being used in threaded operations
  - Verify correlation ID propagation in external API calls
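On the threaded-operations point: if the correlation ID is stored in a contextvars.ContextVar (an assumption, since this document does not say how log_context stores it), work submitted to a thread pool will typically not see the caller's ID unless the context is passed along explicitly. The sketch below shows one way to do that; correlation_id, read_correlation_id, and submit_with_context are hypothetical names.

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

# Hypothetical storage for the correlation ID.
correlation_id = contextvars.ContextVar("correlation_id", default=None)


def read_correlation_id():
    """Return the correlation ID visible to the calling thread."""
    return correlation_id.get()


def submit_with_context(executor, func, *args):
    """Submit func so that it runs inside a copy of the caller's context."""
    ctx = contextvars.copy_context()
    return executor.submit(ctx.run, func, *args)


correlation_id.set("req-12345")
with ThreadPoolExecutor(max_workers=1) as pool:
    seen_in_worker = submit_with_context(pool, read_correlation_id).result()
```

Each submission takes its own copy of the context, so changes made inside the worker do not leak back into the caller. If the ID is kept in thread-local storage instead, it has to be passed to the worker as an argument and set again there.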
Log Analysis Commands
# Search for specific payment
grep "payment_id.*12345" logs/plutus_detailed.log
# Find errors logged during the previous clock hour (GNU date syntax)
grep "$(date -d '1 hour ago' '+%Y-%m-%d %H')" logs/plutus_detailed.log | grep ERROR
# Count security events by type (first word of the message field)
awk -F' - ' '/SECURITY/ {split($5, a, " "); print a[1]}' logs/security.log | sort | uniq -c
# Monitor real-time logs
tail -f logs/plutus_detailed.log
# Analyze correlation ID flow
grep "corr-abc123" logs/*.log | sort
Support and Maintenance
Log File Monitoring
Set up monitoring for:
- Log file growth rates
- Error frequency patterns
- Security event trends
- System performance degradation
Regular Maintenance
- Weekly: Review disk space and cleanup if needed
- Monthly: Analyze performance trends and optimize slow queries
- Quarterly: Review retention policies and adjust as needed
- Annually: Audit security events and update detection rules
Contact Information
For logging system issues or questions:
- Development Team: Review code in logging_config.py, middleware.py
- Operations Team: Monitor analytics dashboard and system health
- Security Team: Review security logs and event patterns
Version History
- v1.0 (Phase 8): Initial enhanced logging implementation
- v1.1 (Phase 9): Analytics dashboard and retention system
- v1.2: Correlation ID improvements and performance optimization
This logging system provides comprehensive visibility into the Plutus Payment System while maintaining security, performance, and operational efficiency. Regular review and maintenance of the logging infrastructure ensures continued reliability and usefulness for system monitoring and troubleshooting.