Documentation

πŸ—οΈ System Architecture

Overview

Smart Mobility Predictor follows a modular, layered architecture designed for scalability, maintainability, and performance.


Architecture Layers

1. Data Ingestion Layer

Responsibility: Collect data from multiple sources

Components:

  • Traffic API Client - Real-time sensor feeds, historical data
  • Weather Service - Current & forecast weather data
  • Event Calendar - Scheduled events, holidays
  • Web Scraper - Public transportation schedules
  • User Feedback System - Ratings & actual vs. predicted data

Technologies:

  • Python requests library
  • Async processing (asyncio, aiohttp)
  • Message queue (RabbitMQ) for event streaming

Data Flow:

[APIs] β†’ [Rate Limiter] β†’ [Validation] β†’ [Message Queue] β†’ [Data Pipeline]
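
The flow above can be sketched with the asyncio tooling this layer uses. This is a minimal, self-contained illustration: the fetch is simulated, a semaphore stands in for the rate limiter, and an `asyncio.Queue` stands in for RabbitMQ; all names here are hypothetical, not the project's actual client code.

```python
import asyncio
import random

# Rate limiter stand-in: at most 5 concurrent source fetches
RATE_LIMIT = asyncio.Semaphore(5)

async def fetch_source(name: str) -> dict:
    """Simulate one API call; a real client would use aiohttp here."""
    async with RATE_LIMIT:
        await asyncio.sleep(0.01)  # network latency stand-in
        return {"source": name, "speed_kmh": random.uniform(0, 130)}

def validate(reading: dict) -> bool:
    """Reject readings outside plausible ranges."""
    return 0 <= reading["speed_kmh"] <= 150

async def ingest(sources: list[str], queue: asyncio.Queue) -> int:
    """Fetch all sources concurrently, validate, then enqueue."""
    readings = await asyncio.gather(*(fetch_source(s) for s in sources))
    accepted = 0
    for r in readings:
        if validate(r):
            await queue.put(r)  # hand-off to the message queue
            accepted += 1
    return accepted

queue: asyncio.Queue = asyncio.Queue()
n = asyncio.run(ingest(["traffic", "weather", "events"], queue))
print(n)  # 3 (all simulated readings pass validation)
```

The semaphore gives back-pressure before the validation step, mirroring the pipeline order shown above.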

2. Data Processing Pipeline

Responsibility: Transform raw data into ML-ready features

Steps:

  1. Data Loading

    • Fetch from databases & message queues
    • Concurrent processing for performance
    • Memory-efficient streaming for large datasets
  2. Data Cleaning

    • Remove duplicates & outliers
    • Handle missing values (KNN imputation for temporal)
    • Validate data ranges
  3. Feature Engineering

    • Temporal features (hour, day, holiday flag)
    • Lag features (traffic 5, 15, 60 min ago)
    • Rolling aggregates (hourly, daily, weekly)
    • Distance decay & proximity features
    • Weather interaction terms
    • Network features (degree centrality)
  4. Data Normalization

    • Min-Max scaling for bounded features (0-100%)
    • Standard scaling (Z-score) for unbounded
    • Scalers fitted on the training split only, to prevent data leakage
  5. Output

    • Training data (60%)
    • Validation data (20%)
    • Test data (20%)
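
The feature-engineering and split steps above can be sketched in pandas. Column names and the toy series are illustrative, but the mechanics match the list: lag features at the 5-minute sensor cadence, a rolling hourly aggregate, temporal features, a chronological 60/20/20 split, and scaling statistics computed from the training rows only.

```python
import pandas as pd

# Toy 5-minute traffic series
ts = pd.date_range("2024-01-01", periods=100, freq="5min")
df = pd.DataFrame({"timestamp": ts, "congestion": range(100)})

# Lag features: traffic 5, 15, and 60 minutes ago (1, 3, 12 steps)
for minutes, steps in [(5, 1), (15, 3), (60, 12)]:
    df[f"lag_{minutes}min"] = df["congestion"].shift(steps)

# Rolling hourly mean (12 five-minute readings)
df["rolling_1h_mean"] = df["congestion"].rolling(12).mean()

# Temporal features
df["hour"] = df["timestamp"].dt.hour
df["dayofweek"] = df["timestamp"].dt.dayofweek

# Chronological 60/20/20 split; fit scaling statistics on train only
n = len(df)
train = df[: int(n * 0.6)]
val = df[int(n * 0.6) : int(n * 0.8)]
test = df[int(n * 0.8) :]
mean, std = train["congestion"].mean(), train["congestion"].std()
df["congestion_z"] = (df["congestion"] - mean) / std

print(len(train), len(val), len(test))  # 60 20 20
```

Splitting by time rather than at random matters here: a random split would leak future lag values into training.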

Technologies:

  • Pandas, NumPy
  • Scikit-learn preprocessing
  • Apache Spark (for distributed processing at scale)

3. Machine Learning Layer

Responsibility: Generate predictions & recommendations

Models Deployed:

| Model | Purpose | Input | Output | Latency |
| --- | --- | --- | --- | --- |
| XGBoost Ensemble | Travel time | Distance, congestion, weather, time | Minutes Β±CI | 50 ms |
| Random Forest | Congestion class | Traffic, weather, events | Low/Med/High | 30 ms |
| K-Means | Zone clusters | Aggregated patterns | Cluster ID | 10 ms |
| Recommendation Engine | Route ranking | All above + user prefs | Ranked routes | 100 ms |

Model Serving:

  • Format: ONNX, joblib, pickle
  • Storage: Model registry (MLflow), S3/GCS
  • Caching: Redis for fast inference
  • Versioning: Semantic versioning (v1.0.0, v1.1.0, etc.)
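
The serving setup above combines a loaded model with a Redis cache. The sketch below shows the cache-aside pattern under stated assumptions: a plain dict stands in for Redis, a stub function stands in for the joblib-loaded XGBoost model, and the cache key embeds the model version so a new deployment naturally invalidates stale entries.

```python
import hashlib
import json

MODEL_VERSION = "v1.0"
cache: dict[str, float] = {}  # stand-in for Redis

def predict_travel_time(features: dict) -> float:
    """Stub model; real code would call the deserialized model's predict()."""
    return 2.0 * features["distance_km"] + 5.0 * features["congestion"]

def cached_predict(features: dict) -> float:
    # Deterministic key: model version + hash of the sorted feature payload
    key = MODEL_VERSION + ":" + hashlib.sha256(
        json.dumps(features, sort_keys=True).encode()
    ).hexdigest()
    if key not in cache:
        cache[key] = predict_travel_time(features)  # miss: run inference
    return cache[key]                               # hit: skip the model

t1 = cached_predict({"distance_km": 10, "congestion": 0.4})
t2 = cached_predict({"distance_km": 10, "congestion": 0.4})  # served from cache
print(t1, len(cache))  # 22.0 1
```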

4. API Layer

Responsibility: Expose ML predictions via HTTP/REST

Endpoints:

POST /api/v1/predict/travel-time
  Input: {origin, destination, departure_time, weather}
  Output: {time_min, confidence, factors}

POST /api/v1/predict/congestion
  Input: {location, time, weather}
  Output: {class, probability}

POST /api/v1/recommendations
  Input: {origin, destination, preferences}
  Output: {routes: [{rank, time, risk, score}]}

GET /api/v1/traffic/real-time
  Input: {bbox: [lat_min, lon_min, lat_max, lon_max]}
  Output: {segments: [{congestion, speed, timestamp}]}

Technologies:

  • FastAPI or Flask
  • Request validation (Pydantic)
  • Rate limiting (Redis)
  • Error handling & logging
  • OpenAPI documentation

5. Application Layer

Responsibility: Interactive user interface

Components:

Web Dashboard (Streamlit/React):

  • Input form (origin, destination, time, preferences)
  • Interactive map with route visualization
  • Recommendation cards with details
  • Real-time traffic heatmap
  • Historical analytics

Features:

  • Responsive design (mobile, tablet, desktop)
  • Dark/light mode
  • Accessibility (WCAG 2.1 AA)
  • Progressive loading
  • Offline map support (PWA)

6. Data Storage Layer

Databases:

| Database | Use Case | Data | Capacity |
| --- | --- | --- | --- |
| PostgreSQL | Transactional data | Users, trips, feedback | 100 GB |
| TimescaleDB | Time-series traffic | Sensor readings, predictions | 5 TB |
| Redis | Caching & sessions | Model cache, user sessions | 50 GB |
| Elasticsearch | Search & logging | API logs, errors | 500 GB |

Schema Example (PostgreSQL):

```sql
CREATE TABLE trips (
    id UUID PRIMARY KEY,
    user_id UUID REFERENCES users(id),
    origin POINT,
    destination POINT,
    departure_time TIMESTAMP,
    transport_mode VARCHAR,
    predicted_time INT,
    actual_time INT,
    feedback FLOAT,
    created_at TIMESTAMP DEFAULT NOW()
);

CREATE INDEX ON trips(created_at, user_id);
```

Deployment Architecture

Containerization

Docker Images:

  1. API Service

    • Base: python:3.10-slim
    • Runtime: FastAPI
    • Size: ~200MB
  2. Web Dashboard

    • Base: python:3.10-slim
    • Runtime: Streamlit
    • Size: ~300MB
  3. Worker Service

    • Scheduled jobs (model retraining, data aggregation)
    • Celery + RabbitMQ
    • Size: ~200MB

Kubernetes Orchestration

Deployment Structure:

```yaml
# Namespace for isolation
apiVersion: v1
kind: Namespace
metadata:
  name: smart-mobility
---
# API Service Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
  namespace: smart-mobility
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: gcr.io/project/api:v1.0
          ports:
            - containerPort: 5000
          resources:
            requests:
              cpu: 500m
              memory: 512Mi
            limits:
              cpu: 1000m
              memory: 1Gi
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: db-secrets
                  key: url
          livenessProbe:
            httpGet:
              path: /health
              port: 5000
            initialDelaySeconds: 30
            periodSeconds: 10
---
# Service exposure
apiVersion: v1
kind: Service
metadata:
  name: api-service
  namespace: smart-mobility
spec:
  type: LoadBalancer
  selector:
    app: api
  ports:
    - protocol: TCP
      port: 80
      targetPort: 5000
```

Load Balancing & Scaling

Horizontal Pod Autoscaler:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
```

Data Flow Diagram

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚         DATA SOURCES (Real-time & Batch)           β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ Traffic APIs β”‚ Weather APIs β”‚ Events β”‚ GPS Traces β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                 β–Ό
          β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
          β”‚ Data Ingestionβ”‚
          β”‚  - Validation β”‚
          β”‚  - Dedup      β”‚
          β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜
                 β–Ό
         β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
         β”‚ Message Queue    β”‚
         β”‚ (RabbitMQ/Kafka) β”‚
         β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                β–Ό
        β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
        β”‚ Feature Engineering β”‚
        β”‚ - Cleaning          β”‚
        β”‚ - Scaling           β”‚
        β”‚ - Feature Creation  β”‚
        β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                 β–Ό
          β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
          β”‚ Feature Storeβ”‚
          β”‚ (Redis)      β”‚
          β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜
                 β–Ό
     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
     β”‚    ML Model Inference      β”‚
     β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
    β”‚ - Travel time (XGBoost)    β”‚
    β”‚ - Congestion (RF)          β”‚
    β”‚ - Clustering (K-Means)     β”‚
    β”‚ - Recommendations (Hybrid) β”‚
    β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
             β–Ό
         β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
         β”‚ API Responsesβ”‚
         β”‚ (FastAPI)    β”‚
         β””β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”˜
              β–Ό
     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
     β”‚ Frontend Display β”‚
     β”‚ (Streamlit/React)β”‚
     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Performance Optimization

Caching Strategy

Multi-Level Cache:

  1. L1: In-Memory (Application)

    • TTL: 5 minutes
    • Capacity: 1GB
    • Content: Recent predictions, model parameters
  2. L2: Redis (Distributed)

    • TTL: 15 minutes
    • Capacity: 50GB
    • Content: Route recommendations, traffic snapshots
  3. L3: Database Cache (PostgreSQL)

    • Permanent storage
    • Content: Historical predictions for analytics

Cache Invalidation:

  • Time-based: Refresh every 15 min
  • Event-based: Invalidate on accident/incident
  • Manual: Force refresh for testing
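
The L1/L2 read path and the invalidation rules above can be sketched as follows. This is a simplified stand-in (dicts instead of the in-process cache and Redis, explicit `now` instead of wall-clock time); TTLs match the tiers listed, and `invalidate` models the event-based path triggered by an incident.

```python
L1_TTL, L2_TTL = 300, 900  # seconds, per the 5-min / 15-min tiers above
l1: dict[str, tuple[float, float]] = {}  # key -> (value, expiry)
l2: dict[str, tuple[float, float]] = {}

def get(key: str, compute, now: float) -> float:
    """Check L1 then L2; on a miss, compute and populate both tiers."""
    for tier, ttl in ((l1, L1_TTL), (l2, L2_TTL)):
        hit = tier.get(key)
        if hit and hit[1] > now:   # entry exists and has not expired
            return hit[0]
    value = compute()
    l1[key] = (value, now + L1_TTL)
    l2[key] = (value, now + L2_TTL)
    return value

def invalidate(key: str) -> None:
    """Event-based invalidation, e.g. on an accident report."""
    l1.pop(key, None)
    l2.pop(key, None)

v1 = get("route:42", lambda: 17.5, now=0)
invalidate("route:42")                       # incident on this route
v2 = get("route:42", lambda: 21.0, now=1)    # recomputed after the incident
print(v1, v2)  # 17.5 21.0
```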

Query Optimization

Database Indexing:

```sql
-- Time-series queries
CREATE INDEX idx_traffic_time ON traffic_readings(timestamp DESC);

-- Geospatial queries
CREATE INDEX idx_location ON traffic_readings USING GIST(location);

-- User queries
CREATE INDEX idx_user_trips ON trips(user_id, created_at DESC);
```

Connection Pooling:

  • PgBouncer for PostgreSQL
  • Pool size: 20-50 connections
  • Timeout: 30 seconds

API Performance

Response Times (Target):

  • Travel time prediction: < 50ms
  • Congestion classification: < 30ms
  • Route recommendations: < 100ms
  • All requests: P95 < 500ms

Optimization Techniques:

  • Request batching (multiple predictions in one call)
  • Async processing for I/O operations
  • Compression (gzip) for large responses
  • CDN caching for static assets

Security Architecture

Authentication & Authorization

API Security:

  • API keys for service-to-service calls
  • JWT tokens for user sessions
  • Role-based access control (RBAC)
  • Rate limiting: 100 req/min per user, 10K/min per key
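
The per-user limit above can be enforced with a fixed-window counter. This sketch uses a dict where the production setup would use Redis `INCR` with an `EXPIRE` on the window key; the names and the window/limit constants mirror the 100 req/min policy stated here.

```python
WINDOW = 60          # seconds per window
USER_LIMIT = 100     # requests allowed per user per window

counters: dict[tuple[str, int], int] = {}  # (user, window index) -> count

def allow(user_id: str, now: float) -> bool:
    """Increment the counter for this user's current window; refuse over-limit."""
    window = int(now // WINDOW)
    key = (user_id, window)
    counters[key] = counters.get(key, 0) + 1
    return counters[key] <= USER_LIMIT

# 150 requests in the same minute: only the first 100 are granted
granted = sum(allow("alice", now=5) for _ in range(150))
print(granted)  # 100
```

Fixed windows allow a brief burst at window boundaries; a sliding-window or token-bucket variant smooths that out at the cost of slightly more state.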

Data Security:

  • Encryption at rest (AES-256)
  • Encryption in transit (TLS 1.3)
  • Database: Column-level encryption for PII
  • Secrets management: HashiCorp Vault

Network Security

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ CDN (Cloudflare)                    β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
         ↓
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ WAF (Web Application Firewall)       β”‚
β”‚ - DDoS protection                   β”‚
β”‚ - SQL injection blocking             β”‚
β”‚ - Rate limiting                      β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
         ↓
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Load Balancer (TLS Termination)      β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
         ↓
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Kubernetes Pod Security Policies     β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
         ↓
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Network Policies (Egress/Ingress)    β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Monitoring & Logging

Observability Stack

| Tool | Purpose | Metrics |
| --- | --- | --- |
| Prometheus | Metrics collection | CPU, memory, requests |
| Grafana | Visualization | Dashboards & alerts |
| ELK Stack | Log aggregation | Errors, performance |
| Jaeger | Distributed tracing | Request flow, latency |

Key Metrics to Monitor

System Health:

  • API latency (P50, P95, P99)
  • Error rate (4xx, 5xx)
  • Pod restart count
  • Database connection pool usage

Model Performance:

  • Prediction accuracy (MAE, RMSE)
  • Prediction drift detection
  • Inference latency distribution
  • Cache hit ratio
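
The accuracy and drift metrics above reduce to a few formulas worth stating explicitly. The sketch below computes MAE and RMSE from predicted vs. actual travel times and flags drift when recent error exceeds a baseline by a factor; the 1.5Γ— threshold is an illustrative assumption, not a documented project setting.

```python
import math

def mae(pred: list[float], actual: list[float]) -> float:
    """Mean absolute error."""
    return sum(abs(p - a) for p, a in zip(pred, actual)) / len(pred)

def rmse(pred: list[float], actual: list[float]) -> float:
    """Root mean squared error; penalizes large misses more than MAE."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(pred))

def drift_detected(recent_mae: float, baseline_mae: float,
                   factor: float = 1.5) -> bool:
    """Crude drift signal: recent error well above the training-time baseline."""
    return recent_mae > factor * baseline_mae

pred = [10.0, 12.0, 15.0, 9.0]
actual = [11.0, 12.0, 13.0, 10.0]
print(mae(pred, actual), round(rmse(pred, actual), 3))  # 1.0 1.225
print(drift_detected(recent_mae=2.0, baseline_mae=1.0))  # True
```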

Business Metrics:

  • User requests per second
  • Route recommendation adoption rate
  • User satisfaction (ratings)
  • Cost per prediction

Disaster Recovery

Backup Strategy:

  • Database: Daily full backups, hourly incremental
  • Models: Version-controlled in S3, Git
  • Code: GitHub with branch protection
  • Configuration: Terraform state stored in GCS

Recovery Time Objectives (RTO):

  • Complete system failure: 1 hour
  • Database failure: 15 minutes
  • Model failure: 5 minutes (fallback model)

Scalability

Horizontal Scaling:

  • Stateless API services (unlimited)
  • Database replicas for read scaling
  • Caching layer for hot data

Vertical Scaling:

  • Increase instance size for memory-intensive jobs
  • GPU nodes for model training

Expected Scale:

  • 10M users
  • 1M requests/hour
  • 100 predictions/second sustained
  • 500 predictions/second burst

Cost Optimization

Resource Management:

  • Spot instances for batch jobs (70% savings)
  • Reserved instances for baseline (40% savings)
  • Autoscaling to match demand (30% savings)

Estimated Monthly Costs:

  • Compute: $5,000
  • Storage: $1,000
  • Data transfer: $2,000
  • Total: $8,000

Cost Reduction Path:

  1. Implement caching (15% reduction)
  2. Use spot instances (20% reduction)
  3. Optimize queries (10% reduction)
  4. Combined savings: β‰ˆ39% when applied sequentially, since the reductions compound (1 βˆ’ 0.85 Γ— 0.80 Γ— 0.90 β‰ˆ 0.39), rather than the 45% a simple sum would suggest