AI-Powered Vehicle-Based License Plate Recognition System

Executive Summary

This document outlines a proposed AI-powered vehicle tracking and identification system designed to enhance law enforcement capabilities in two critical scenarios: (1) Amber Alerts involving child abduction cases, and (2) incidents involving vehicles used in crimes, including high-speed pursuits. The system combines edge-based license plate recognition (LPR), vehicle feature identification (make, model, color), and centralized AI coordination across city-wide infrastructure to enable accurate, real-time tracking.


1. Introduction

Law enforcement agencies face growing challenges in responding to Amber Alerts and locating vehicles involved in criminal activity. Traditional methods are often reactive, relying on radio reports, eyewitness accounts, or incomplete surveillance. High-speed pursuits endanger both officers and civilians, and obscured or missing license plates can render many systems ineffective.

This system leverages real-time AI on edge devices and centralized cloud coordination to overcome these obstacles. It enables authorities to identify, track, and locate suspect vehicles throughout a geographic region with minimal manual intervention.


2. System Overview

The solution consists of two major components:

A. Edge-Based Vehicle Recognition Units:

  • Mounted on police vehicles and public infrastructure (traffic lights, overpasses, parking lots)
  • AI models trained for license plate recognition (YOLOv5 + EasyOCR)
  • Vehicle classification models to identify make, model, and color
  • GPS-tagged detection logs for location-based tracking
  • Sends minimal data (plate ID, vehicle features, timestamp, GPS) to central server

B. Centralized Command & Control System:

  • Ingests detection logs via secure, real-time messaging (e.g., Amazon Kinesis, MQTT)
  • Matches results against police databases and Amber Alert bulletins
  • Maintains vehicle state and trajectory across regions
  • Alerts patrol units when suspect vehicles are reacquired by infrastructure cameras
  • Requests video feed only when actionable (e.g., vehicle reacquired)

3. Use Case 1: Amber Alert Tracking

In an Amber Alert, time is critical. The suspect vehicle may be reported with only partial information—a partial plate, vehicle make, or color. This system enables the following:

  • All edge cameras scan and report vehicles matching the alert’s description
  • Even if the plate is missing or obstructed, the system can match based on model and color
  • As the vehicle moves, detections are chained across the region, mapping the route
  • Law enforcement is notified at key checkpoints, improving interception chances
  • Video evidence can be pulled only after confirmation to conserve bandwidth
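The partial-plate matching above can be sketched as follows. This is a minimal stand-in that uses the standard library's difflib in place of RapidFuzz; the plate values and the 0.6 threshold are illustrative, not from a real alert.

```python
# Sketch: matching a partial plate report against an observed plate.
# difflib.SequenceMatcher stands in for RapidFuzz's fuzz.ratio here.
from difflib import SequenceMatcher

def partial_plate_match(partial: str, observed: str, threshold: float = 0.6) -> bool:
    """Return True if the observed plate plausibly matches a partial report."""
    ratio = SequenceMatcher(None, partial.upper(), observed.upper()).ratio()
    # Also accept a direct substring hit, e.g. "7ABC" seen inside "7ABC123".
    return partial.upper() in observed.upper() or ratio >= threshold

print(partial_plate_match("7ABC", "7ABC123"))   # substring hit -> True
print(partial_plate_match("XYZ99", "7ABC123"))  # unrelated -> False
```

In practice the threshold would be tuned against real OCR error patterns, since a too-loose setting floods officers with false matches.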

4. Use Case 2: Crime Vehicle Pursuit and Region-Wide Tracking

In criminal activity (e.g., robbery, assault), suspects often flee in a vehicle. Rather than engaging in a dangerous high-speed chase:

  • The officer records the fleeing vehicle using dashcam or mobile device
  • Edge system in the car recognizes plate (if visible) and vehicle features
  • Tracking mode is triggered: all city infrastructure is alerted to watch for the vehicle
  • As cameras reacquire the suspect vehicle, real-time GPS is sent to the command center
  • Officers are directed to intercept at low-risk points or to set up roadblocks
  • Eliminates high-speed chases, reducing injury risk and increasing apprehension rates

5. Handling Obscured or Missing Plates

License plates may be deliberately covered, damaged, or otherwise unreadable. The system addresses this with:

  • Vehicle shape and feature detection using CNN-based classifiers
  • Color histograms and model identification (e.g., “White Toyota Camry”)
  • Temporal tracking: matching trajectory and appearance across cameras
  • Multi-modal tracking using both LPR and vehicle signature vectors
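The signature-vector matching can be illustrated with a small cosine-similarity check. This is a pure-Python sketch; a deployment would compare real ReID embeddings with NumPy or Torch, and the 0.9 threshold is an assumption to be tuned per model.

```python
# Sketch: multi-modal matching via cosine similarity of vehicle
# "signature" vectors (e.g., ReID embeddings).
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def is_same_vehicle(sig_a, sig_b, threshold=0.9):
    # Threshold is illustrative; tuned per embedding model in practice.
    return cosine_similarity(sig_a, sig_b) >= threshold

suspect = [0.9, 0.1, 0.4]    # hypothetical embedding from the alert
seen    = [0.88, 0.12, 0.41] # hypothetical embedding from a camera
print(is_same_vehicle(suspect, seen))  # True
```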

6. Privacy and Security Considerations

  • Only minimal data is transmitted: no PII unless match is found in law enforcement database
  • Full video feed is requested only when a match is confirmed
  • End-to-end encryption of all detection logs
  • Full audit logs and access control for all video and data retrieval

7. Advantages

  • Faster response to Amber Alerts with automated vehicle tracing
  • Reduced reliance on high-speed pursuit, improving public and officer safety
  • Higher apprehension rates for fleeing suspects
  • Less bandwidth usage via selective video stream access
  • Works with both license plate and appearance-based recognition
  • Modular, scalable architecture using edge AI and cloud orchestration

8. Conclusion

The proposed system transforms law enforcement vehicle detection from a passive, manual process to a proactive, intelligent network capable of tracking suspect vehicles across entire cities. Whether rescuing a child in danger or intercepting criminals safely, the system enables a new level of efficiency and safety for public servants and citizens alike.

Component Architecture Breakdown

  1. Edge Detection Device (In Police Vehicles and Traffic Cameras)
    • Components:
      • Video camera module (vehicle- or infrastructure-mounted)
      • Jetson-based processor running YOLOv5 model
      • GPS module for precise location
      • Software stack: PyTorch, YOLOv5 model (for license plate and vehicle detection), EasyOCR, RapidFuzz for fuzzy matching
    • Functions:
      • Capture real-time video
      • Detect license plates, vehicle shape, make, color
      • Perform OCR and match against watchlists
      • Send minimal metadata (plate number, GPS, frame) to Central System
  2. Centralized Tracking System (Cloud-hosted)
    • Components:
      • Cloud server or Kubernetes-hosted REST API
      • License plate and vehicle database (including Amber Alerts, BOLOs)
      • Kafka or Amazon Kinesis for ingestion of streaming alerts from edge
      • Event-driven tracking engine with correlation logic
    • Functions:
      • Receives edge alerts with time and GPS
      • Matches incoming data against Amber Alert and criminal databases
      • Sends tracking commands to all edge devices and infrastructure cameras
      • Provides UI dashboard with tracking map, metadata display, and video review
  3. User Interface and Monitoring Center
    • Components:
      • Web-based UI/dashboard
      • Real-time map view of vehicle movement
      • Alert notification system (email, SMS, system messages)
      • Secure video playback and query tools
    • Functions:
      • Allows law enforcement to view incoming detections
      • Enables manual override and command broadcast
      • Supports pulling video from infrastructure cameras only when confirmed sightings occur
  4. Vehicle Recognition Model Stack
    • YOLOv5 (License Plate Detection)
    • EasyOCR (Character Recognition)
    • Custom CNN or pretrained classifier (Vehicle Make/Model/Color)
    • RapidFuzz (Text similarity and consolidation)
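The correlation logic in the Centralized Tracking System (item 2 above) can be sketched as a simple grouping of GPS-tagged detections into time-ordered trajectories. The field names here are illustrative, not a fixed schema.

```python
# Sketch of the correlation step: chain detections of the same plate
# into a time-ordered trajectory for the dashboard map.
from collections import defaultdict

def build_trajectories(detections):
    """Group detections by plate and sort each group by timestamp."""
    routes = defaultdict(list)
    for d in detections:
        routes[d["plate"]].append(d)
    for plate in routes:
        routes[plate].sort(key=lambda d: d["ts"])
    return dict(routes)

detections = [
    {"plate": "7ABC123", "ts": 1002, "gps": (37.34, -121.89)},
    {"plate": "7ABC123", "ts": 1000, "gps": (37.33, -121.90)},
    {"plate": "5XYZ987", "ts": 1001, "gps": (37.30, -121.85)},
]
routes = build_trajectories(detections)
print([d["ts"] for d in routes["7ABC123"]])  # time-ordered: [1000, 1002]
```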

Advantages of the System

  • Real-time Detection: Alerts are sent instantly upon detection.
  • Bandwidth Efficient: Video is only streamed when requested, reducing cloud costs.
  • Extensible: Can incorporate facial recognition, vehicle re-identification, and behavioral analytics in future phases.
  • Safety: Reduces police pursuit reliance, enabling safer community policing.
  • Scalability: Easily expandable to more vehicles, cameras, and jurisdictions.

Conclusion

This AI-powered vehicle tracking system integrates edge and cloud technologies to improve the speed, safety, and effectiveness of law enforcement response. It enables smart city infrastructure to assist in real-time tracking of vehicles involved in Amber Alerts or criminal investigations, helping to safeguard lives while minimizing public risk.

Edge Architecture Overview

The camera system used in vehicles can be built using the following hardware and software. This section outlines a real-time license plate recognition (LPR) and vehicle re-identification (ReID) pipeline for deployment on NVIDIA Jetson devices (Orin NX / Xavier NX). The pipeline uses an optimized YOLOv8 for plate detection, PaddleOCR for character recognition, and ResNet-based ReID models to identify vehicles even with partial plate visibility.

Pipeline Components

1. **Video Input**: CSI or USB camera (1080p @ 30–60 FPS)
2. **YOLOv8 (TensorRT)**: Detects license plate bounding boxes
3. **PaddleOCR (ONNX/TensorRT)**: Recognizes text inside bounding box
4. **Vehicle ReID (ResNet-50 / VeRi-776)**: Extracts vehicle feature vectors
5. **Similarity Matching**: Cosine similarity to match known vehicles or interpolate matches from other cameras
6. **Streaming Interface**: RTSP/WebRTC for output and REST/gRPC API for matches

Hardware Requirements

– NVIDIA Jetson Orin NX / Xavier NX
– 8GB+ RAM
– CSI camera (e.g., Leopard Imaging) or USB3 camera
– SSD storage for temporary buffering
– LTE/5G module or Wi-Fi for cloud connectivity

POC code snippet

Here is a sample proof of concept built with an open-source license plate model and YOLOv5 plate detection, using EasyOCR for character recognition and OpenCV for video processing; it loads a sample video file. The first version of the code was generated with ChatGPT, but it was very crude and needed a lot of work: the libraries were mismatched and it did not even run. I started with YOLOv8, but it was incompatible with the pre-built open-source model, so I downgraded to YOLOv5.

Initially the pipeline picked up all of the text around the license plate area, and did so for every frame of the video. With thousands of frames that can show partial plates while the camera pans across the vehicle, you accumulate many spurious plates with fewer characters, plus words from license plate holders. To reduce the noise and focus on the primary plate numbers, I used RapidFuzz for fuzzy matching of plates and modified the code to merge similar text from adjacent frames. The logic keys on the largest text region and writes it as the primary value in a comma-delimited file, with the other text in a separate column for additional context. The code is in Python; a production version should use something faster such as Go, but Python made it easy to test against the PyTorch libraries. I also added debug code that displays the video with boxes drawn around the plates, which let me verify the model was picking up plates properly while I tuned the confidence, the box padding, and the fuzzy-matching threshold.

Here is the code

"""
License Plate Recognition Script
--------------------------------
This script detects license plates in a video using a YOLOv5 model and extracts text from them using EasyOCR.

Dependencies:
- YOLOv5 model (torch-based) from 'trained_models/best.pt'
- EasyOCR for OCR (MIT License)
- RapidFuzz for fuzzy matching
- OpenCV for video processing

Usage:
python run_ssd_example.py --log debug

Log Levels:
--log debug : Verbose output with all detection steps
--log info : General operational information (default)
--log warning : Warnings only
--log error : Errors only
--log critical : Critical failures only
"""

import os
import sys
import cv2
import csv
import torch
import numpy as np
import easyocr
import logging
import argparse
from pathlib import Path
from collections import defaultdict
from rapidfuzz import fuzz

# --- Argument Parser for Logging Level ---
parser = argparse.ArgumentParser(description="License Plate Recognition Script")
parser.add_argument('--log', default='info', choices=['debug', 'info', 'warning', 'error', 'critical'],
                    help='Set the logging level (default: info)')
args = parser.parse_args()

log_levels = {
    'debug': logging.DEBUG,
    'info': logging.INFO,
    'warning': logging.WARNING,
    'error': logging.ERROR,
    'critical': logging.CRITICAL
}
logging.basicConfig(
    level=log_levels[args.log],
    format="%(asctime)s [%(levelname)s] %(message)s",
    handlers=[logging.StreamHandler()]
)

# --- Configure YOLOv5 path ---
FILE = Path(__file__).resolve()
ROOT = FILE.parents[1] / 'yolov5'
if str(ROOT) not in sys.path:
    sys.path.append(str(ROOT))
# Fix path so yolov5 internal imports like 'from utils import TryExcept' work
YOLOV5_PATH = Path(__file__).resolve().parent / "yolov5"
if str(YOLOV5_PATH) not in sys.path:
    sys.path.append(str(YOLOV5_PATH))
from yolov5.models.common import DetectMultiBackend
from utils.general import non_max_suppression
from utils.torch_utils import select_device

# --- Configuration ---
video_path = 'videos/LPR_Project.mp4'
model_path = 'trained_models/best.pt'
output_csv_path = 'final_detected_plates.csv'
min_confidence = 0.6
fuzzy_match_threshold = 85
min_range_len = 2

# --- Device and model setup ---
device = select_device('cpu')
model = DetectMultiBackend(model_path, device=device, dnn=False)
model.eval()
ocr_reader = easyocr.Reader(['en'], gpu=False)

# --- Process video frames ---
cap = cv2.VideoCapture(video_path)
frame_num = 0
plate_detections = []

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

    original_h, original_w = frame.shape[:2]

    img = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    img_resized = cv2.resize(img, (640, 640))
    img_tensor = torch.from_numpy(img_resized).permute(2, 0, 1).float().unsqueeze(0) / 255.0
    img_tensor = img_tensor.to(device)

    with torch.no_grad():
        pred = model(img_tensor, augment=False, visualize=False)
        pred = non_max_suppression(pred, conf_thres=0.25, iou_thres=0.45, classes=None, agnostic=False)

    for det in pred:
        if len(det):
            for *xyxy, conf, cls in det:
                # Rescale coordinates to original frame
                x1, y1, x2, y2 = map(int, [xyxy[0] * original_w / 640,
                                           xyxy[1] * original_h / 640,
                                           xyxy[2] * original_w / 640,
                                           xyxy[3] * original_h / 640])

                pad = 10
                h, w = frame.shape[:2]
                x1_p, y1_p = max(x1 - pad, 0), max(y1 - pad, 0)
                x2_p, y2_p = min(x2 + pad, w), min(y2 + pad, h)

                cropped_plate = frame[y1_p:y2_p, x1_p:x2_p]
                ocr_results = ocr_reader.readtext(cropped_plate)

                all_texts = []
                if ocr_results:
                    largest = max(
                        ocr_results,
                        key=lambda r: (r[0][1][0] - r[0][0][0]) * (r[0][2][1] - r[0][0][1])
                    )
                    text, conf_ocr = largest[1], largest[2]

                    all_texts = [f"{res[1].strip().replace(' ', '').replace('-', '').upper()}({res[2]:.2f})"
                                 for res in ocr_results]

                    if conf_ocr >= min_confidence:
                        cleaned = text.strip().replace(" ", "").replace("-", "").upper()
                        if 5 <= len(cleaned) <= 8 and cleaned.isalnum():
                            plate_detections.append((frame_num, cleaned, all_texts))
                            logging.debug(f"Frame {frame_num}: Detected plate '{cleaned}' with conf {conf_ocr:.2f}")
                            cv2.rectangle(frame, (x1_p, y1_p), (x2_p, y2_p), (0, 255, 0), 2)
                            cv2.putText(frame, cleaned, (x1_p, y1_p - 10),
                                        cv2.FONT_HERSHEY_SIMPLEX, 0.7, (255, 0, 0), 2)
                        else:
                            logging.info(f"Frame {frame_num}: Ignored text '{cleaned}' (length or format mismatch)")
                    else:
                        logging.info(f"Frame {frame_num}: OCR confidence {conf_ocr:.2f} below threshold")
                else:
                    logging.debug(f"Frame {frame_num}: No OCR results found")

    cv2.imshow('Frame', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

    frame_num += 1

cap.release()
cv2.destroyAllWindows()

# --- Group detections by text ---
detections_by_text = defaultdict(list)
for frame, plate, _ in plate_detections:
    detections_by_text[plate].append(frame)

# --- Build frame ranges ---
range_groups = []
for plate, frames in detections_by_text.items():
    frames.sort()
    start = prev = frames[0]
    for f in frames[1:]:
        if f == prev + 1:
            prev = f
        else:
            if prev - start + 1 >= min_range_len:
                range_groups.append({'start': start, 'end': prev, 'plate': plate})
            start = prev = f
    if prev - start + 1 >= min_range_len:
        range_groups.append({'start': start, 'end': prev, 'plate': plate})

# --- Merge fuzzy duplicates ---
def merge_ranges(ranges):
    ranges.sort(key=lambda x: x['start'])
    merged = []
    for r in ranges:
        added = False
        for m in merged:
            if r['start'] <= m['end'] + 1 and fuzz.ratio(r['plate'], m['plate']) >= fuzzy_match_threshold:
                m['start'] = min(m['start'], r['start'])
                m['end'] = max(m['end'], r['end'])
                if len(r['plate']) > len(m['plate']):
                    m['plate'] = r['plate']
                added = True
                break
        if not added:
            merged.append(r.copy())
    return merged

consolidated = merge_ranges(range_groups)

# --- Final deduplication and export ---
final = []
seen = set()
for item in consolidated:
    if item['plate'] not in seen:
        final.append(item)
        seen.add(item['plate'])

with open(output_csv_path, 'w', newline='') as csvfile:
    csv_writer = csv.writer(csvfile)
    csv_writer.writerow(['FrameStart', 'FrameEnd', 'PlateID', 'AllText'])
    for entry in final:
        matching_texts = [
            t for (f, plate, t) in plate_detections
            if plate == entry['plate'] and entry['start'] <= f <= entry['end']
        ]
        all_text_flat = "|".join(matching_texts[0]) if matching_texts else ""
        csv_writer.writerow([entry['start'], entry['end'], entry['plate'], all_text_flat])

logging.info(f"✅ Detection complete. Output written to {output_csv_path}")

🚀 Cloud-Based System Design in AWS

🎯 Objectives

  • Ingest and process real-time and batch video feeds.
  • Detect vehicles and license plates using AI/ML models.
  • Identify vehicles even when license plates are obscured (make, model, color).
  • Log and cross-reference against law enforcement or Amber Alert databases.
  • Send alerts and visualize data for authorized agencies.

🧱 Architecture Overview

📍 1. Edge Capture Devices (Police/City Cameras)

  • Hardware: NVIDIA Jetson devices or compatible IP cameras
  • Responsibility: Capture video, run YOLOv5 models locally, extract plate data, GPS.
  • Output: JSON message with:
    • Plate text
    • Vehicle attributes (make/model/color)
    • GPS coordinates
    • Timestamps
    • Optional: Short video clip (compressed)
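A minimal sketch of the JSON message described above, built in Python. The field names and device ID are illustrative rather than a fixed schema.

```python
# Build the edge-to-cloud detection message as a JSON payload.
import json
import time

detection = {
    "device_id": "unit-042",              # hypothetical edge node ID
    "plate": "7ABC123",
    "plate_confidence": 0.91,
    "vehicle": {"make": "Toyota", "model": "Camry", "color": "white"},
    "gps": {"lat": 37.3349, "lon": -121.8881},
    "ts": int(time.time()),
    "clip_url": None,                     # set only when video is requested
}
payload = json.dumps(detection)
print(payload)
```

Keeping the payload this small is what makes the bandwidth-efficient design work: video only moves when the cloud explicitly requests it.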



☁️ 2. Ingestion Layer (AWS Cloud)

| Component | AWS Service | Description |
| --- | --- | --- |
| 📡 Message Gateway | AWS IoT Core or Amazon API Gateway | Receives messages from edge nodes securely (MQTT or REST) |
| 🔁 Data Stream | Amazon Kinesis Data Streams | Collects and buffers incoming messages for downstream processing |
| 🛡️ Security | AWS IAM, Cognito, TLS encryption | Ensures only authorized devices and officers send/receive data |



🧠 3. AI Inference & Analysis (Optional in Cloud)

| Component | AWS Service | Description |
| --- | --- | --- |
| 🔍 Fallback AI Processing | Amazon SageMaker, Lambda, or ECS Fargate | Re-runs ML models in the cloud when the edge misses a detection (e.g., blurred plates, multiple angles) |
| 🧠 Model Storage | S3 or EFS | Stores YOLOv5 weights, vehicle detection models |
| 🔄 Event Logic | AWS Lambda | Detects matches with Amber Alert or criminal DB; triggers alerts |
| 🧠 Vehicle Matching | Amazon Rekognition Custom Labels (optional) | Make/model/color classification from video frames |



🗃️ 4. Central Database & Cross-Referencing

| Component | AWS Service | Description |
| --- | --- | --- |
| 📘 License & Vehicle DB | Amazon DynamoDB or Aurora Serverless | Stores sightings of plates, vehicle types, locations |
| 👮‍♀️ Crime/Amber Alert DB | Amazon RDS (PostgreSQL) or external | Maintains suspect/alert records to be matched in real time |
| 📍 Geospatial Querying | Amazon Location Service or PostGIS | Enables tracking and plotting of movement patterns |



🛎️ 5. Notifications and Alerts

| Component | AWS Service | Description |
| --- | --- | --- |
| 📲 Real-Time Alerts | Amazon SNS or AWS AppSync | Sends push/email alerts to officers or dispatch when a match is found |
| 🎥 Video Access on Demand | Amazon Kinesis Video Streams, S3 | Allows temporary cloud video pull when triggered by alert/tracking |
| 📈 Dashboards | Amazon QuickSight or Grafana | Visualizes vehicle movements, matches, hot zones |



🧑‍💻 6. Admin Dashboard (Web UI)

| Component | AWS Service | Description |
| --- | --- | --- |
| 🧭 Frontend Hosting | Amazon S3 + CloudFront | Serves a secure web UI to law enforcement or government users |
| 🌐 Backend APIs | AWS API Gateway + Lambda | Manages plate queries, region lookups, alert confirmations |
| 👥 Access Control | AWS Cognito + IAM Roles | Manages roles for officers, analysts, and system admins |

🔐 Security & Compliance

  • Data encryption: TLS in transit, KMS for storage
  • Audit logging: AWS CloudTrail + CloudWatch
  • Access control: Role-based with multi-factor authentication
  • Compliance: CJIS, GDPR, SOC2 readiness



📊 Data Flow Summary
graph TD
    A[Edge Cameras] -->|YOLOv5 Detection + GPS| B[Kinesis Stream]
    B --> C[Lambda Matching Logic]
    C --> D{Is Plate Match?}
    D -->|Yes| E[Send Alert via SNS]
    D -->|No| F[Store in DB]
    F --> G[DynamoDB / RDS]
    G --> H[Dashboard / Map Overlay]
    E --> H
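The Lambda matching step in the flow above can be sketched without AWS dependencies: decode a Kinesis-style record, look the plate up in a watchlist, and return the alert/store decision. The record shape mirrors the standard Kinesis event format (base64-encoded `kinesis.data`), but the watchlist contents and handler are illustrative.

```python
# Core of the matching logic, free of boto3 so it can be unit-tested.
import base64
import json

WATCHLIST = {"7ABC123": "AMBER-2024-017"}  # hypothetical alert index

def handle_record(record):
    """Return ('alert', case_id) on a watchlist hit, else ('store', None)."""
    data = json.loads(base64.b64decode(record["kinesis"]["data"]))
    case_id = WATCHLIST.get(data.get("plate", ""))
    if case_id:
        return ("alert", case_id)   # real handler would publish to SNS
    return ("store", None)          # real handler would write to DynamoDB

msg = base64.b64encode(json.dumps({"plate": "7ABC123"}).encode()).decode()
print(handle_record({"kinesis": {"data": msg}}))  # ('alert', 'AMBER-2024-017')
```

In the deployed version the watchlist lookup would hit the Amber Alert/BOLO database rather than an in-memory dict.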

💡 Optional Enhancements

  • Offline mode for vehicles in rural areas (store & forward)
  • Federated search to allow police to query across agencies
  • Plate tampering detection (detect modified or covered plates)
  • Anomaly detection to identify abnormal traffic patterns
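The store-and-forward enhancement can be sketched as a small memory-backed queue that buffers detections while offline and flushes on reconnect. The `send` callable here is a stand-in for the real uplink; a field unit would persist the buffer to SSD.

```python
# Sketch: buffer detections while offline, flush in order on reconnect.
from collections import deque

class StoreAndForward:
    def __init__(self, send):
        self.send = send      # callable returning True on successful uplink
        self.buffer = deque()

    def submit(self, detection):
        self.buffer.append(detection)
        self.flush()

    def flush(self):
        while self.buffer:
            if not self.send(self.buffer[0]):
                break         # still offline; keep buffering
            self.buffer.popleft()

online = False
sent = []
saf = StoreAndForward(lambda d: online and (sent.append(d) or True))
saf.submit({"plate": "7ABC123"})  # offline: stays buffered
online = True
saf.flush()
print(sent)  # [{'plate': '7ABC123'}]
```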

🔧 Recommended NVIDIA Jetson Edge Devices

According to performance specs and real-time license plate OCR benchmarks:

Jetson Nano (4 GB)

  • GPU: Maxwell with 128 CUDA cores (~0.47 TFLOPS FP16)
  • Inference performance: ~14 FPS using optimized models like Light-Edge (ResNet‑18 + CTC recognition) at ~4.8 W power 
  • Best for budget-sensitive deployments with modest real-time needs

Jetson Xavier NX (8 GB)

  • GPU: Volta-based 384 CUDA cores, 48 Tensor cores (approx. 21 TOPS INT8) 
  • Ideal for multi-stream inference, license plate detection plus make/model/color models
  • Can support video resolutions at 1080p and potentially 4K at moderate fps

Jetson Orin Nano / Orin NX (8–16 GB)

  • GPU: Ampere architecture, up to 1,024 CUDA cores, 2nd Gen Tensor cores (up to ~40 TOPS)
  • High performance with low power envelope (7–15 W) 
  • Excellent for real-time multi-model workloads including ALPR and vehicle recognition

Jetson AGX Orin (32 GB)

  • Up to ~64 GB LPDDR5, Ampere GPU with 2,048 CUDA cores (~275 TOPS INT8), built-in DL accelerators 
  • Ideal for multi-camera setups, high-speed plate tracking, and advanced analytics onboard vehicle units

📸 Recommended Camera Modules

Selecting a high-fidelity camera with low motion blur and high frame-rate is key for high-speed vehicle capture.

e-CAM24_CUNX (e-con Systems)

  • Sensor: ON Semiconductor AR0234 1/2.6” global shutter
  • Supports up to 120 FPS at 1080p resolution for motion-free capture of fast-moving vehicles 

XIMEA CMV4000 (PCIe)

  • Sensor: CMOSIS CMV4000, 4.2 MP, global shutter, Near‑Infrared (NIR) sensitive
  • PCIe interface enables direct memory transfer with minimal CPU overhead and low latency, ideal for real-time edge processing



📊 Summary Table

| Component | Specification | Performance Benefit |
| --- | --- | --- |
| Jetson Nano (4 GB) | Maxwell GPU, 128 cores, ~4.8 W usage | ~14 FPS OCR (optimized models), cost-effective |
| Jetson Xavier NX | Volta GPU, 21 TOPS INT8, compact form factor | Supports multiple camera streams and vehicle classification |
| Jetson Orin Nano/NX | Ampere GPU, up to 40 TOPS INT8, 7–15 W TDP | High throughput ALPR & MMR in compact systems |
| Jetson AGX Orin | Ampere GPU, ~275 TOPS INT8, expansive I/O | High-throughput inference on multiple cameras/inference pipes |
| e-CAM24_CUNX camera | 1080p @ 120 FPS global shutter CMOS sensor | Minimal motion blur when vehicles pass at speed |
| XIMEA CMV4000 camera | 4.2 MP PCIe NIR camera | Near real-time capture with very low latency |

✅ Which Setup for Your Use Case?

  • Mobile Police Vehicle (fast-moving, limited space): Use Jetson Orin Nano (or Xavier NX) with e-CAM24_CUNX for high-framerate plate capture under motion.
  • Infrastructure Traffic Camera (fixed, high-res): Pair Jetson AGX Orin with high-res global shutter PCIe modules (XIMEA CMV4000) for superior capture, classification and tracking.
  • Budget / Low-power Deployments: Jetson Nano series with USB or MIPI camera modules is acceptable for lower vehicle throughput or slower areas.

🔍 ALPR Model Performance Example:

The Light-Edge model deployed via TensorRT achieves 14 FPS at ~4.8 W on a Jetson Nano, delivering over 90% mAP while doubling speed-per-watt compared to earlier models.


🔚 Key Takeaways

  • Device & Camera Matching: Select a Jetson variant that supports your required frame rate, inference load, environmental constraints, and power budget.
  • Strobe/IR Illumination: Consider NIR sensitive or global shutter cameras for low-light or night scenarios.
  • Model Optimization: Use INT8 quantization (TensorRT) and lightweight models for real-time ALPR performance on constrained devices.