Executive Summary

This document outlines a proposed AI-powered vehicle tracking and identification system designed to enhance law enforcement capabilities in two critical scenarios: (1) Amber Alerts involving child abduction cases, and (2) incidents involving vehicles used in crimes, including high-speed pursuits. The system combines edge-based license plate recognition (LPR), vehicle feature identification (make, model, color), and centralized AI coordination across city-wide infrastructure to enable accurate, real-time tracking.
1. Introduction

Law enforcement agencies face growing challenges in responding to Amber Alerts and locating vehicles involved in criminal activity. Traditional methods are often reactive, relying on radio reports, eyewitness accounts, or incomplete surveillance. High-speed pursuits endanger both officers and civilians, and obscured or missing license plates can render many systems ineffective.
This system leverages real-time AI on edge devices and centralized cloud coordination to overcome these obstacles. It enables authorities to identify, track, and locate suspect vehicles throughout a geographic region with minimal manual intervention.
2. System Overview

The solution consists of two major components:
A. Edge-Based Vehicle Recognition Units:
- Mounted on police vehicles and public infrastructure (traffic lights, overpasses, parking lots)
- AI models trained for license plate recognition (YOLOv5 + EasyOCR)
- Vehicle classification models to identify make, model, and color
- GPS-tagged detection logs for location-based tracking
- Sends minimal data (plate ID, vehicle features, timestamp, GPS) to central server
B. Centralized Command & Control System:
- Ingests detection logs via secure, real-time messaging (e.g., Amazon Kinesis, MQTT)
- Matches results against police databases and Amber Alert bulletins
- Maintains vehicle state and trajectory across regions
- Alerts patrol units when suspect vehicles are reacquired by infrastructure cameras
- Requests video feed only when actionable (e.g., vehicle reacquired)
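To make the "minimal data" idea concrete, here is a sketch of the message an edge unit might publish to the central server. The field names and schema are illustrative only; a real deployment would pin down a versioned schema shared by edge and cloud.

```python
import json
import time

def build_detection_message(plate, make, model, color, lat, lon, confidence):
    """Assemble the minimal metadata an edge unit would publish.

    Field names are illustrative, not a fixed specification.
    """
    return json.dumps({
        "plate": plate,              # OCR result; may be partial or None
        "vehicle": {"make": make, "model": model, "color": color},
        "gps": {"lat": lat, "lon": lon},
        "timestamp": time.time(),    # epoch seconds at detection
        "confidence": confidence,    # detector confidence, 0..1
    })

msg = build_detection_message("8ABC123", "Toyota", "Camry", "white",
                              37.7749, -122.4194, 0.91)
```

Because no video leaves the device in this message, the payload stays small enough for LTE/5G uplinks even at high detection rates.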
3. Use Case 1: Amber Alert Tracking
In an Amber Alert, time is critical. The suspect vehicle may be reported with only partial information—a partial plate, vehicle make, or color. This system enables the following:
- All edge cameras scan and report vehicles matching the alert’s description
- Even if the plate is missing or obstructed, the system can match based on model and color
- As the vehicle moves, detections are chained across the region, mapping the route
- Law enforcement is notified at key checkpoints, improving interception chances
- Video evidence can be pulled only after confirmation to conserve bandwidth
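The partial-information matching described above can be sketched as follows. The POC later in this document uses RapidFuzz for fuzzy plate matching; to keep this illustration self-contained it substitutes the standard library's `difflib`, and the threshold and field names are assumptions.

```python
from difflib import SequenceMatcher

def partial_plate_score(observed, alert_plate):
    """Similarity in [0, 100] between an observed (possibly partial)
    plate and the plate from an alert bulletin."""
    observed = observed.replace(" ", "").replace("-", "").upper()
    alert_plate = alert_plate.replace(" ", "").replace("-", "").upper()
    return 100 * SequenceMatcher(None, observed, alert_plate).ratio()

def matches_alert(detection, alert, plate_threshold=70):
    """A detection matches if the plate is similar enough, or if the
    plate is unreadable but make/model/color all agree."""
    if detection["plate"]:
        if partial_plate_score(detection["plate"], alert["plate"]) >= plate_threshold:
            return True
    return (detection["make"] == alert["make"]
            and detection["model"] == alert["model"]
            and detection["color"] == alert["color"])

alert = {"plate": "8ABC123", "make": "Toyota", "model": "Camry", "color": "white"}
# A partial plate ("8ABC12") still scores well above the threshold
hit = matches_alert({"plate": "8ABC12", "make": "Toyota", "model": "Camry",
                     "color": "white"}, alert)
```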
4. Use Case 2: Crime Vehicle Pursuit and Region-Wide Tracking
In criminal activity (e.g., robbery, assault), suspects often flee in a vehicle. Rather than engaging in a dangerous high-speed chase:
- The officer records the fleeing vehicle using dashcam or mobile device
- Edge system in the car recognizes plate (if visible) and vehicle features
- Tracking mode is triggered: all city infrastructure is alerted to watch for the vehicle
- As cameras reacquire the suspect vehicle, real-time GPS is sent to the command center
- Officers are directed to intercept at low-risk points or to set up roadblocks
- Eliminates high-speed chases, reducing injury risk and increasing apprehension rates
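The "tracking mode is triggered" step above is essentially a command fan-out to every edge node. In production this broadcast would travel over MQTT or Kinesis; the in-memory sketch below (class and field names are invented for illustration) shows the shape of the logic.

```python
from dataclasses import dataclass, field

@dataclass
class EdgeNode:
    node_id: str
    watchlist: set = field(default_factory=set)

    def on_command(self, cmd):
        # A real edge unit would receive this over MQTT and update the
        # plates/signatures its local models watch for.
        if cmd["action"] == "watch":
            self.watchlist.add(cmd["plate"])

def trigger_tracking_mode(nodes, plate):
    """Fan a 'watch' command out to every registered edge node."""
    cmd = {"action": "watch", "plate": plate}
    for node in nodes:
        node.on_command(cmd)

nodes = [EdgeNode(f"cam-{i}") for i in range(3)]
trigger_tracking_mode(nodes, "8ABC123")
```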
5. Handling Obscured or Missing Plates
License plates may be deliberately covered, damaged, or otherwise unreadable. The system addresses this with:
- Vehicle shape and feature detection using CNN-based classifiers
- Color histograms and model identification (e.g., “White Toyota Camry”)
- Temporal tracking: matching trajectory and appearance across cameras
- Multi-modal tracking using both LPR and vehicle signature vectors
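Matching on vehicle signature vectors typically reduces to a similarity comparison between appearance embeddings produced by the ReID model. A minimal pure-Python sketch using cosine similarity follows; the 0.9 threshold and the toy 4-dimensional vectors are illustrative (real embeddings are hundreds of dimensions), not tuned values.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0

def same_vehicle(sig_a, sig_b, threshold=0.9):
    """Decide whether two appearance signatures (e.g. ReID embeddings)
    likely belong to the same vehicle. Threshold is illustrative."""
    return cosine_similarity(sig_a, sig_b) >= threshold

cam1 = [0.12, 0.80, 0.05, 0.41]   # signature from camera 1 (made-up values)
cam2 = [0.11, 0.79, 0.07, 0.43]   # same vehicle seen by camera 2
other = [0.90, 0.10, 0.70, 0.02]  # a different vehicle
```

Combining this appearance score with the LPR result (when a plate is readable) is what the multi-modal bullet above refers to: either signal alone can carry the match when the other is unavailable.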
6. Privacy and Security Considerations
- Only minimal data is transmitted: no PII unless match is found in law enforcement database
- Full video feed is requested only when a match is confirmed
- End-to-end encryption of all detection logs
- Full audit logs and access control for all video and data retrieval
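One way to make the audit-log requirement tamper-evident is hash chaining, where each entry commits to the hash of the previous one. The sketch below is illustrative only; a production deployment would rely on managed services (e.g. AWS CloudTrail) rather than a hand-rolled log.

```python
import hashlib
import json

def append_audit_entry(log, actor, action, resource):
    """Append a hash-chained entry: each record commits to the hash of
    the previous one, so any later edit breaks the chain."""
    prev_hash = log[-1]["hash"] if log else "0" * 64
    record = {"actor": actor, "action": action, "resource": resource,
              "prev": prev_hash}
    payload = json.dumps(record, sort_keys=True).encode()
    record["hash"] = hashlib.sha256(payload).hexdigest()
    log.append(record)
    return log

def verify_chain(log):
    """Recompute every hash; return False if any entry was altered."""
    prev = "0" * 64
    for rec in log:
        if rec["prev"] != prev:
            return False
        body = {k: rec[k] for k in ("actor", "action", "resource", "prev")}
        payload = json.dumps(body, sort_keys=True).encode()
        if hashlib.sha256(payload).hexdigest() != rec["hash"]:
            return False
        prev = rec["hash"]
    return True

log = []
append_audit_entry(log, "officer42", "video_pull", "cam-17")
append_audit_entry(log, "analyst7", "plate_query", "8ABC123")
```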
7. Advantages
- Faster response to Amber Alerts with automated vehicle tracing
- Reduced reliance on high-speed pursuit, improving public and officer safety
- Higher apprehension rates for fleeing suspects
- Less bandwidth usage via selective video stream access
- Works with both license plate and appearance-based recognition
- Modular, scalable architecture using edge AI and cloud orchestration
8. Conclusion

The proposed system transforms law enforcement vehicle detection from a passive, manual process to a proactive, intelligent network capable of tracking suspect vehicles across entire cities. Whether rescuing a child in danger or intercepting criminals safely, the system enables a new level of efficiency and safety for public servants and citizens alike.
Component Architecture Breakdown
- Edge Detection Device (In Police Vehicles and Traffic Cameras)
  - Components:
    - Video camera module (vehicle- or infrastructure-mounted)
    - Jetson-based processor running YOLOv5 model
    - GPS module for precise location
    - Software stack: PyTorch, YOLOv5 model (for license plate and vehicle detection), EasyOCR, RapidFuzz for fuzzy matching
  - Functions:
    - Capture real-time video
    - Detect license plates, vehicle shape, make, color
    - Perform OCR and match against watchlists
    - Send minimal metadata (plate number, GPS, frame) to Central System
- Centralized Tracking System (Cloud-hosted)
  - Components:
    - Cloud server or Kubernetes-hosted REST API
    - License plate and vehicle database (including Amber Alerts, BOLOs)
    - Kafka or Amazon Kinesis for ingestion of streaming alerts from edge
    - Event-driven tracking engine with correlation logic
  - Functions:
    - Receives edge alerts with time and GPS
    - Matches incoming data against Amber Alert and criminal databases
    - Sends tracking commands to all edge devices and infrastructure cameras
    - Provides UI dashboard with tracking map, metadata display, and video review
- User Interface and Monitoring Center
  - Components:
    - Web-based UI/dashboard
    - Real-time map view of vehicle movement
    - Alert notification system (email, SMS, system messages)
    - Secure video playback and query tools
  - Functions:
    - Allows law enforcement to view incoming detections
    - Enables manual override and command broadcast
    - Supports pulling video from infrastructure cameras only when confirmed sightings occur
- Vehicle Recognition Model Stack
  - YOLOv5 (License Plate Detection)
  - EasyOCR (Character Recognition)
  - Custom CNN or pretrained classifier (Vehicle Make/Model/Color)
  - RapidFuzz (Text similarity and consolidation)
Advantages of the System
- Real-time Detection: Alerts are sent instantly upon detection.
- Bandwidth Efficient: Video is only streamed when requested, reducing cloud costs.
- Extensible: Can incorporate facial recognition, vehicle re-identification, and behavioral analytics in future phases.
- Safety: Reduces police pursuit reliance, enabling safer community policing.
- Scalability: Easily expandable to more vehicles, cameras, and jurisdictions.
Conclusion
This AI-powered vehicle tracking system integrates edge and cloud technologies to improve the speed, safety, and effectiveness of law enforcement response. It enables smart city infrastructure to assist in real-time tracking of vehicles involved in Amber Alerts or criminal investigations, helping to safeguard lives while minimizing public risk.
Edge Architecture Overview
The in-vehicle camera system can be built using the following hardware and software. This section outlines a real-time license plate recognition (LPR) and vehicle re-identification (ReID) pipeline for deployment on NVIDIA Jetson devices (Orin NX / Xavier NX). The pipeline uses an optimized YOLOv8 for plate detection, PaddleOCR for character recognition, and ResNet-based ReID models to identify vehicles even with partial plate visibility.
Pipeline Components
1. **Video Input**: CSI or USB camera (1080p @ 30–60 FPS)
2. **YOLOv8 (TensorRT)**: Detects license plate bounding boxes
3. **PaddleOCR (ONNX/TensorRT)**: Recognizes text inside bounding box
4. **Vehicle ReID (ResNet-50 / VeRi-776)**: Extracts vehicle feature vectors
5. **Similarity Matching**: Cosine similarity to match known vehicles or interpolate matches from other cameras
6. **Streaming Interface**: RTSP/WebRTC for output and REST/gRPC API for matches
Hardware Requirements
- NVIDIA Jetson Orin NX / Xavier NX
- 8GB+ RAM
- CSI camera (e.g., Leopard Imaging) or USB3 camera
- SSD storage for temporary buffering
- LTE/5G module or Wi-Fi for cloud connectivity
POC code snippet
Here is a sample code snippet built around an open-source license plate model and YOLOv5 plate recognition. It uses EasyOCR for character recognition, loads a sample video file, and processes the video with OpenCV. The first version of the code was generated using ChatGPT, but it was very crude and needed a lot of refinement: the libraries were mismatched, and it didn't even run. I started with YOLOv8, but it was incompatible with the pre-built open-source model, so I downgraded to YOLOv5.

Initially the script picked up all of the text around the license plate area, and did so for every frame of the video. With thousands of frames that can show partial plates while the camera pans across the vehicle, you end up with many spurious detections: plates with fewer characters, plus words from license plate frames and holders. To reduce the number of candidate plates and focus on the primary plate numbers, I used RapidFuzz for fuzzy matching and modified the code to merge similar text from adjacent frames. The logic focuses on the largest text region first and uses it as the key in a comma-delimited output file, with the other detected text shown in a separate column for additional context.

The code is in Python; it should eventually be ported to something faster such as Go, but Python made it easy to test against the PyTorch libraries. I also added debug code to display the video with boxes drawn around the plates, which helped while tuning the detection confidence, the bounding-box padding, and the fuzzy-matching threshold.

Here is the code:
"""
License Plate Recognition Script
--------------------------------
This script detects license plates in a video using a YOLOv5 model and extracts text from them using EasyOCR.
Dependencies:
- YOLOv5 model (torch-based) from 'trained_models/best.pt'
- EasyOCR for OCR (MIT License)
- RapidFuzz for fuzzy matching
- OpenCV for video processing
Usage:
python run_ssd_example.py --log debug
Log Levels:
--log debug : Verbose output with all detection steps
--log info : General operational information (default)
--log warning : Warnings only
--log error : Errors only
--log critical : Critical failures only
"""
import os
import sys
import cv2
import csv
import torch
import numpy as np
import easyocr
import logging
import argparse
from pathlib import Path
from collections import defaultdict
from rapidfuzz import fuzz
# --- Argument Parser for Logging Level ---
parser = argparse.ArgumentParser(description="License Plate Recognition Script")
parser.add_argument('--log', default='info', choices=['debug', 'info', 'warning', 'error', 'critical'],
help='Set the logging level (default: info)')
args = parser.parse_args()
log_levels = {
'debug': logging.DEBUG,
'info': logging.INFO,
'warning': logging.WARNING,
'error': logging.ERROR,
'critical': logging.CRITICAL
}
logging.basicConfig(
level=log_levels[args.log],
format="%(asctime)s [%(levelname)s] %(message)s",
handlers=[logging.StreamHandler()]
)
# --- Configure YOLOv5 path ---
FILE = Path(__file__).resolve()
ROOT = FILE.parents[1] / 'yolov5'
if str(ROOT) not in sys.path:
sys.path.append(str(ROOT))
# Fix path so yolov5 internal imports like 'from utils import TryExcept' work
YOLOV5_PATH = Path(__file__).resolve().parent / "yolov5"
if str(YOLOV5_PATH) not in sys.path:
sys.path.append(str(YOLOV5_PATH))
from yolov5.models.common import DetectMultiBackend
from utils.general import non_max_suppression
from utils.torch_utils import select_device
# --- Configuration ---
video_path = 'videos/LPR_Project.mp4'
model_path = 'trained_models/best.pt'
output_csv_path = 'final_detected_plates.csv'
min_confidence = 0.6
fuzzy_match_threshold = 85
min_range_len = 2
# --- Device and model setup ---
device = select_device('cpu')
model = DetectMultiBackend(model_path, device=device, dnn=False)
model.eval()
ocr_reader = easyocr.Reader(['en'], gpu=False)
# --- Process video frames ---
cap = cv2.VideoCapture(video_path)
frame_num = 0
plate_detections = []
while cap.isOpened():
ret, frame = cap.read()
if not ret:
break
original_h, original_w = frame.shape[:2]
img = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
img_resized = cv2.resize(img, (640, 640))
img_tensor = torch.from_numpy(img_resized).permute(2, 0, 1).float().unsqueeze(0) / 255.0
img_tensor = img_tensor.to(device)
with torch.no_grad():
pred = model(img_tensor, augment=False, visualize=False)
pred = non_max_suppression(pred, conf_thres=0.25, iou_thres=0.45, classes=None, agnostic=False)
for det in pred:
if len(det):
for *xyxy, conf, cls in det:
# Rescale coordinates to original frame
x1, y1, x2, y2 = map(int, [xyxy[0] * original_w / 640,
xyxy[1] * original_h / 640,
xyxy[2] * original_w / 640,
xyxy[3] * original_h / 640])
pad = 10
h, w = frame.shape[:2]
x1_p, y1_p = max(x1 - pad, 0), max(y1 - pad, 0)
x2_p, y2_p = min(x2 + pad, w), min(y2 + pad, h)
cropped_plate = frame[y1_p:y2_p, x1_p:x2_p]
ocr_results = ocr_reader.readtext(cropped_plate)
all_texts = []
if ocr_results:
largest = max(
ocr_results,
key=lambda r: (r[0][1][0] - r[0][0][0]) * (r[0][2][1] - r[0][0][1])
)
text, conf_ocr = largest[1], largest[2]
all_texts = [f"{res[1].strip().replace(' ', '').replace('-', '').upper()}({res[2]:.2f})"
for res in ocr_results]
if conf_ocr >= min_confidence:
cleaned = text.strip().replace(" ", "").replace("-", "").upper()
if 5 <= len(cleaned) <= 8 and cleaned.isalnum():
plate_detections.append((frame_num, cleaned, all_texts))
logging.debug(f"Frame {frame_num}: Detected plate '{cleaned}' with conf {conf_ocr:.2f}")
cv2.rectangle(frame, (x1_p, y1_p), (x2_p, y2_p), (0, 255, 0), 2)
cv2.putText(frame, cleaned, (x1_p, y1_p - 10),
cv2.FONT_HERSHEY_SIMPLEX, 0.7, (255, 0, 0), 2)
else:
logging.info(f"Frame {frame_num}: Ignored text '{cleaned}' (length or format mismatch)")
else:
logging.info(f"Frame {frame_num}: OCR confidence {conf_ocr:.2f} below threshold")
else:
logging.debug(f"Frame {frame_num}: No OCR results found")
cv2.imshow('Frame', frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
frame_num += 1
cap.release()
cv2.destroyAllWindows()
# --- Group detections by text ---
detections_by_text = defaultdict(list)
for frame, plate, _ in plate_detections:
detections_by_text[plate].append(frame)
# --- Build frame ranges ---
range_groups = []
for plate, frames in detections_by_text.items():
frames.sort()
start = prev = frames[0]
for f in frames[1:]:
if f == prev + 1:
prev = f
else:
if prev - start + 1 >= min_range_len:
range_groups.append({'start': start, 'end': prev, 'plate': plate})
start = prev = f
if prev - start + 1 >= min_range_len:
range_groups.append({'start': start, 'end': prev, 'plate': plate})
# --- Merge fuzzy duplicates ---
def merge_ranges(ranges):
ranges.sort(key=lambda x: x['start'])
merged = []
for r in ranges:
added = False
for m in merged:
if r['start'] <= m['end'] + 1 and fuzz.ratio(r['plate'], m['plate']) >= fuzzy_match_threshold:
m['start'] = min(m['start'], r['start'])
m['end'] = max(m['end'], r['end'])
if len(r['plate']) > len(m['plate']):
m['plate'] = r['plate']
added = True
break
if not added:
merged.append(r.copy())
return merged
consolidated = merge_ranges(range_groups)
# --- Final deduplication and export ---
final = []
seen = set()
for item in consolidated:
if item['plate'] not in seen:
final.append(item)
seen.add(item['plate'])
with open(output_csv_path, 'w', newline='') as csvfile:
csv_writer = csv.writer(csvfile)
csv_writer.writerow(['FrameStart', 'FrameEnd', 'PlateID', 'AllText'])
for entry in final:
matching_texts = [
t for (f, plate, t) in plate_detections
if plate == entry['plate'] and entry['start'] <= f <= entry['end']
]
all_text_flat = "|".join(matching_texts[0]) if matching_texts else ""
csv_writer.writerow([entry['start'], entry['end'], entry['plate'], all_text_flat])
logging.info(f"✅ Detection complete. Output written to {output_csv_path}")
🚀 Cloud-Based System Design in AWS
🎯 Objectives
- Ingest and process real-time and batch video feeds.
- Detect vehicles and license plates using AI/ML models.
- Identify vehicles even when license plates are obscured (make, model, color).
- Log and cross-reference against law enforcement or Amber Alert databases.
- Send alerts and visualize data for authorized agencies.
🧱 Architecture Overview
📍 1. Edge Capture Devices (Police/City Cameras)
- Hardware: NVIDIA Jetson devices or compatible IP cameras
- Responsibility: Capture video, run YOLOv5 models locally, extract plate data, GPS.
- Output: JSON message with:
- Plate text
- Vehicle attributes (make/model/color)
- GPS coordinates
- Timestamps
- Optional: Short video clip (compressed)
☁️ 2. Ingestion Layer (AWS Cloud)
| Component | AWS Service | Description |
|---|---|---|
| 📡 Message Gateway | AWS IoT Core or Amazon API Gateway | Receives messages from Edge nodes securely (MQTT or REST) |
| 🔁 Data Stream | Amazon Kinesis Data Streams | Collects and buffers incoming messages for downstream processing |
| 🛡️ Security | AWS IAM, Cognito, TLS Encryption | Ensures only authorized devices and officers send/receive data |
🧠 3. AI Inference & Analysis (Optional in Cloud)
| Component | AWS Service | Description |
|---|---|---|
| 🔍 Fallback AI Processing | Amazon SageMaker, Lambda, or ECS Fargate | In case Edge misses a detection, re-run ML models in the cloud (e.g., blurred plates, multiple angles) |
| 🧠 Model Storage | S3 or EFS | Stores YOLOv5 weights, vehicle detection models |
| 🔄 Event Logic | AWS Lambda | Detects matches with Amber Alert or criminal DB; triggers alerts |
| 🧠 Vehicle Matching | Amazon Rekognition Custom Labels (optional) | For make/model/color classification from video frames |
🗃️ 4. Central Database & Cross-Referencing
| Component | AWS Service | Description |
|---|---|---|
| 📘 License & Vehicle DB | Amazon DynamoDB or Aurora Serverless | Stores sightings of plates, vehicle types, locations |
| 👮♀️ Crime/Amber Alert DB | Amazon RDS (PostgreSQL) or external | Maintains suspect/alert records to be matched in real-time |
| 📍 Geospatial Querying | Amazon Location Service or PostGIS | Enables tracking and plotting of movement patterns |
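When the geospatial layer chains sightings into a trajectory, it also needs a sanity check: two detections can only belong to the same vehicle if the implied travel speed is physically plausible. A standard-library sketch using the haversine formula follows; the 160 km/h speed cap is an illustrative parameter, not a recommendation.

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two GPS fixes, in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def plausible_same_vehicle(sighting_a, sighting_b, max_speed_kmh=160):
    """Reject a correlation if the vehicle would have needed to travel
    faster than max_speed_kmh between the two sightings."""
    dist = haversine_km(sighting_a["lat"], sighting_a["lon"],
                        sighting_b["lat"], sighting_b["lon"])
    dt_h = abs(sighting_b["ts"] - sighting_a["ts"]) / 3600.0
    if dt_h == 0:
        return dist < 0.05  # essentially the same spot
    return dist / dt_h <= max_speed_kmh

a = {"lat": 37.7749, "lon": -122.4194, "ts": 0}    # illustrative fix, downtown SF
b = {"lat": 37.8044, "lon": -122.2712, "ts": 600}  # ~13 km away, 10 minutes later
```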
🛎️ 5. Notifications and Alerts
| Component | AWS Service | Description |
|---|---|---|
| 📲 Real-Time Alerts | Amazon SNS or AWS AppSync | Sends push/email alerts to officers or dispatch when a match is found |
| 🎥 Video Access on Demand | Amazon Kinesis Video Streams, S3 | Allows temporary cloud video pull if triggered by alert/tracking |
| 📈 Dashboards | Amazon QuickSight or Grafana | Visualizes vehicle movements, matches, hot zones |
🧑💻 6. Admin Dashboard (Web UI)
| Component | AWS Service | Description |
|---|---|---|
| 🧭 Frontend Hosting | Amazon S3 + CloudFront | Serves a secure web UI to law enforcement or government users |
| 🌐 Backend APIs | AWS API Gateway + Lambda | Manages plate queries, region lookups, alert confirmations |
| 👥 Access Control | AWS Cognito + IAM Roles | Manages roles for officers, analysts, and system admins |
🔐 Security & Compliance
- Data encryption: TLS in transit, KMS for storage
- Audit logging: AWS CloudTrail + CloudWatch
- Access control: Role-based with multi-factor authentication
- Compliance: CJIS, GDPR, SOC2 readiness
📊 Data Flow Summary
```mermaid
graph TD
    A[Edge Cameras] -->|YOLOv5 Detection + GPS| B[Kinesis Stream]
    B --> C[Lambda Matching Logic]
    C --> D{Is Plate Match?}
    D -->|Yes| E[Send Alert via SNS]
    D -->|No| F[Store in DB]
    F --> G[DynamoDB / RDS]
    G --> H[Dashboard / Map Overlay]
    E --> H
```
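The "Lambda Matching Logic" node in the data flow above might look like the sketch below. The handler signature and the Kinesis record envelope (base64-encoded `data` field) follow the standard AWS Lambda Python conventions, but the watchlist, the payload fields, and the in-memory stand-ins for SNS and DynamoDB are illustrative only.

```python
import base64
import json

# Illustrative in-memory stand-ins for the Amber Alert / BOLO tables;
# a real handler would query DynamoDB or RDS instead.
WATCHLIST = {"8ABC123", "7XYZ999"}
alerts_sent, sightings_stored = [], []

def handler(event, context=None):
    """Process a batch of Kinesis records: alert on watchlist hits,
    otherwise store the sighting for later querying."""
    for record in event["Records"]:
        payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
        if payload.get("plate") in WATCHLIST:
            alerts_sent.append(payload)       # stand-in for an SNS publish
        else:
            sightings_stored.append(payload)  # stand-in for a DynamoDB put
    return {"alerts": len(alerts_sent), "stored": len(sightings_stored)}

def kinesis_event(payloads):
    """Wrap detection payloads in the Kinesis event envelope."""
    return {"Records": [
        {"kinesis": {"data": base64.b64encode(json.dumps(p).encode()).decode()}}
        for p in payloads
    ]}

result = handler(kinesis_event([
    {"plate": "8ABC123", "lat": 37.77, "lon": -122.42},
    {"plate": "5QRS777", "lat": 37.78, "lon": -122.41},
]))
```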
💡 Optional Enhancements
- Offline mode for vehicles in rural areas (store & forward)
- Federated search to allow police to query across agencies
- Plate tampering detection (detect modified or covered plates)
- Anomaly detection to identify abnormal traffic patterns
🔧 Recommended NVIDIA Jetson Edge Devices
According to performance specs and real-time license plate OCR benchmarks:
Jetson Nano (4 GB)
- GPU: Maxwell with 128 CUDA cores (~0.47 TFLOPS FP16)
- Inference performance: ~14 FPS using optimized models like Light-Edge (ResNet‑18 + CTC recognition) at ~4.8 W power
- Best for budget-sensitive deployments with modest real-time needs
Jetson Xavier NX (8 GB)
- GPU: Volta-based 384 CUDA cores, 48 Tensor cores (approx. 21 TOPS INT8)
- Ideal for multi-stream inference, license plate detection plus make/model/color models
- Can support video resolutions at 1080p and potentially 4K at moderate fps
Jetson Orin Nano / Orin NX (8–16 GB)
- GPU: Ampere architecture, up to 1,024 CUDA cores, 2nd Gen Tensor cores (up to ~40 TOPS)
- High performance with low power envelope (7–15 W)
- Excellent for real-time multi-model workloads including ALPR and vehicle recognition
Jetson AGX Orin (32/64 GB)
- 32 or 64 GB LPDDR5, Ampere GPU with up to 2,048 CUDA cores (up to ~275 TOPS INT8 on the 64 GB model), built-in DL accelerators
- Ideal for multi-camera setups, high-speed plate tracking, and advanced analytics onboard vehicle units
📸 Recommended Camera Modules
Selecting a high-fidelity camera with low motion blur and high frame-rate is key for high-speed vehicle capture.
e-CAM24_CUNX (e-con Systems)
- Sensor: ON Semiconductor AR0234 1/2.6” global shutter
- Supports up to 120 FPS at 1080p resolution for motion-free capture of fast-moving vehicles
XIMEA CMV4000 (PCIe)
- Sensor: CMOSIS CMV4000, 4.2 MP, global shutter, Near‑Infrared (NIR) sensitive
- PCIe interface enables direct memory transfer with minimal CPU overhead and low latency—ideal for real-time edge processing
📊 Summary Table
| Component | Specification | Performance Benefit |
|---|---|---|
| Jetson Nano (4 GB) | Maxwell GPU, 128 cores, ~4.8 W usage | ~14 FPS OCR (optimized models), cost-effective |
| Jetson Xavier NX | Volta GPU, 21 TOPS INT8, compact form factor | Supports multiple camera streams and vehicle classification |
| Jetson Orin Nano/NX | Ampere GPU, up to 40 TOPS INT8, 7–15 W TDP | High throughput ALPR & MMR in compact systems |
| Jetson AGX Orin | Ampere GPU ~275 TOPS INT8, expansive I/O | High-throughput inference on multiple cameras/inference pipes |
| e-CAM24_CUNX camera | 1080p @120 FPS global shutter CMOS sensor | Great for minimal motion blur when vehicles pass at speed |
| XIMEA CMV4000 camera | 4.2 MP PCIe NIR camera | Near real-time capture with very low latency |
✅ Which Setup for Your Use Case?
- Mobile Police Vehicle (fast-moving, limited space): Use Jetson Orin Nano (or Xavier NX) with e-CAM24_CUNX for high-framerate plate capture under motion.
- Infrastructure Traffic Camera (fixed, high-res): Pair Jetson AGX Orin with high-res global shutter PCIe modules (XIMEA CMV4000) for superior capture, classification and tracking.
- Budget / Low-power Deployments: Jetson Nano series with USB or MIPI camera modules is acceptable for lower vehicle throughput or slower areas.
🔍 ALPR Model Performance Example:
The Light-Edge model deployed via TensorRT achieves 14 FPS at ~4.8 W on Jetson Nano, delivering over 90% mAP while doubling speed-per-watt compared to earlier models.
🔚 Key Takeaways
- Device & Camera Matching: Select a Jetson variant that supports your required frame rate, inference load, environmental constraints, and power budget.
- Strobe/IR Illumination: Consider NIR sensitive or global shutter cameras for low-light or night scenarios.
- Model Optimization: Use INT8 quantization (TensorRT) and lightweight models for real-time ALPR performance on constrained devices.