
Sentry — Edge Vision Camera

A privacy-first security camera running quantized YOLO person detection on a Raspberry Pi 5 with a Hailo-8L accelerator — every frame stays on the device.

Active since Sep 2025 #computer-vision #edge-ai #raspberry-pi #yolo
Inference latency: 62 ms
Detection FPS: 27
False positives cut: 94%
Cloud dependencies: 0

Sentry exists because every consumer security camera I looked at wanted a subscription and a cloud account to tell me someone was in my own driveway. The hardware to do person detection locally has been cheap for a while — a Raspberry Pi 5 plus the Hailo-8L AI kit is under $120 — so I built the camera I actually wanted: detection on-device, events on my LAN, footage that never leaves the house.

Architecture

A Pi Camera Module 3 feeds 1536×864 frames into a capture process. Frames are letterboxed to 640×640 and handed to the Hailo-8L, which runs an INT8-quantized YOLOv8-nano. Detections go onto an internal queue; a filter stage decides whether they constitute a real event; confirmed events write an H.264 clip (with a 3-second pre-roll from a rolling buffer) and a row into SQLite. A FastAPI dashboard on the same Pi serves event history, clip playback, and a live preview. Nothing opens an outbound connection.
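The letterbox step is mostly arithmetic, so here is a minimal sketch of it using only the frame sizes quoted above. The `Letterbox` dataclass and both helper names are hypothetical, and centred padding is an assumption (some pipelines pad only on one side); this is not Sentry's actual preprocessing code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Letterbox:
    scale: float  # factor applied to the source frame
    new_w: int    # width of the resized frame inside the square input
    new_h: int    # height of the resized frame inside the square input
    pad_x: int    # padding on the left, in pixels
    pad_y: int    # padding on the top, in pixels


def letterbox_geometry(src_w: int, src_h: int, dst: int = 640) -> Letterbox:
    """Fit a src_w x src_h frame into a dst x dst square, keeping aspect ratio."""
    scale = min(dst / src_w, dst / src_h)
    new_w = round(src_w * scale)
    new_h = round(src_h * scale)
    # Assumption: the leftover space is split evenly between both sides.
    return Letterbox(scale, new_w, new_h, (dst - new_w) // 2, (dst - new_h) // 2)


def to_source_coords(box, lb: Letterbox):
    """Map an (x1, y1, x2, y2) box from model space back to the camera frame."""
    x1, y1, x2, y2 = box
    return (
        (x1 - lb.pad_x) / lb.scale,
        (y1 - lb.pad_y) / lb.scale,
        (x2 - lb.pad_x) / lb.scale,
        (y2 - lb.pad_y) / lb.scale,
    )


lb = letterbox_geometry(1536, 864)
full_frame = to_source_coords((0, 140, 640, 500), lb)
```

For a 1536×864 frame the scale is 640/1536, so the image shrinks to 640×360 and gets 140 rows of padding above and below. Detections come back in that padded 640×640 space, which is why the inverse mapping is needed before boxes are drawn on the original frame.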

Quantization trade-offs

The stock FP32 model was never an option — the Hailo wants INT8, and that’s where the interesting failures live. My first calibration set was 200 frames of daytime footage, and recall on small, distant people fell from 0.89 to 0.61 after quantization. The fix was boring but effective: a calibration set stratified across day, dusk, and IR night frames, plus per-channel quantization on the neck layers. Recall recovered to 0.85, and inference sits at 62 ms end-to-end including pre/post-processing — the accelerator itself does its part in about 11 ms.
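A stratified calibration set can be assembled with a few lines of standard-library Python. The sketch below is illustrative only: the condition labels, file names, pool sizes, the per-stratum count of 60 and the `stratified_sample` helper are all made up for the example, and the Hailo toolchain's own calibration step is not shown.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(frames, per_stratum, seed=0):
    """Pick up to `per_stratum` frames from each lighting condition.

    `frames` is an iterable of (path, condition) pairs.
    """
    by_condition = defaultdict(list)
    for path, condition in frames:
        by_condition[condition].append(path)

    rng = random.Random(seed)  # fixed seed so the set is reproducible
    selected = []
    for condition in sorted(by_condition):
        pool = by_condition[condition]
        k = min(per_stratum, len(pool))  # never ask for more than exists
        selected.extend((path, condition) for path in rng.sample(pool, k))
    return selected


# Hypothetical frame inventory, heavily skewed toward daytime footage.
frames = (
    [(f"day_{i:04d}.jpg", "day") for i in range(300)]
    + [(f"dusk_{i:04d}.jpg", "dusk") for i in range(120)]
    + [(f"ir_{i:04d}.jpg", "ir_night") for i in range(90)]
)

calibration_set = stratified_sample(frames, per_stratum=60)
counts = Counter(condition for _, condition in calibration_set)
```

Sampling per condition rather than from the pooled frames keeps the daytime majority from dominating the activation statistics the quantizer sees, which is the failure the daytime-only set ran into.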

Killing false positives with temporal voting

A single frame is a terrible witness. Car headlights sweeping the fence produced confident phantom “persons” for one or two frames at a time. Instead of raising the confidence threshold (which also drops real detections), Sentry requires agreement across a sliding window:

def confirm(self, hit: bool) -> bool:
    self.window.append(hit)          # deque(maxlen=15), ~0.5 s at 27 fps
    return sum(self.window) >= 9     # 9-of-15 vote triggers an event

That one deque cut false positives by 94% in a two-week log comparison, at the cost of roughly half a second of detection latency — a trade I’d make every time for a camera that only alerts when it matters.
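To see the vote in action, here is a self-contained sketch that wraps the `confirm` method above in a class and replays two synthetic hit sequences. The `TemporalVoter` class name, the `first_trigger` helper and both sequences are hypothetical; only the 15-frame window and the 9-hit threshold come from the snippet above.

```python
from collections import deque


class TemporalVoter:
    def __init__(self, window_size: int = 15, min_hits: int = 9) -> None:
        self.window = deque(maxlen=window_size)  # oldest vote drops off automatically
        self.min_hits = min_hits

    def confirm(self, hit: bool) -> bool:
        self.window.append(hit)
        return sum(self.window) >= self.min_hits


def first_trigger(hits):
    """Return the index of the first frame that confirms an event, or None."""
    voter = TemporalVoter()
    for index, hit in enumerate(hits):
        if voter.confirm(hit):
            return index
    return None


# A two-frame headlight flash versus a person who stays in view.
headlights = [False] * 10 + [True] * 2 + [False] * 20
person = [False] * 10 + [True] * 30

headlight_result = first_trigger(headlights)  # None: two hits never reach nine
person_result = first_trigger(person)         # 18: the ninth consecutive hit
```

With an unbroken run of hits the vote passes on the ninth frame, which is 9/27 ≈ 0.33 s at 27 fps. A flickering detection needs more of the 15-frame window (about 0.56 s) to collect nine hits, which is consistent with the roughly half-second cost quoted above.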

The dashboard now shows about a month of events, and I’ve stopped checking it obsessively, which I think is the actual success metric for a security camera.

Development timeline

  1. 2025-09: First inference. Got the Hailo runtime executing a stock YOLOv8n at 30 fps on test video.

  2. 2025-11: Quantization rabbit hole. INT8 calibration tanked recall on small people until I fixed the calibration set.

  3. 2026-02: Dashboard shipped. FastAPI event server with clip storage and a live MJPEG preview.

  4. 2026-05: Temporal voting. False positives from headlights and cats dropped 94% overnight.