Aerial Object Detection Live Stream
I’m excited to share my real-time, AI-driven object detection system that monitors the skies for wildlife, aircraft, meteorites, and, hopefully, UFOs. The system uses a two-stage detection pipeline to capture as much relevant activity as possible, while continuously improving its performance through ongoing training.
Technical Details
—————————
Hardware: Raspberry Pi 5 with the Hailo AI Acceleration Module
Model: YOLOv8M (data labeled with Roboflow; model trained on Google Colab with an A100 GPU)
Video Source: A Reolink security camera mounted on my roof, streaming via RTSP
—————————
Data Handling & Training
—————————
1. Video Capture & Preprocessing
I capture high-resolution video (3840×2160) from the security camera and downscale it to 1080×720. This resizing helps optimize processing speed while maintaining sufficient detail for detection.
2. Moving Object Detection
I start by running a dedicated model that scans the video for any movement. This stage is tuned for high sensitivity (high recall) so I can capture every potential event—even if that means occasionally picking up non-target phenomena like clouds, celestial bodies, Northern Lights, swaying tree branches, or snowfall.
3. Object Detection & Classification
When movement is detected, a secondary object detection model takes over to classify the object. For targets such as birds and aircraft, I typically use a detection threshold of 55%.
At a 55% threshold: The model maintains high precision (fewer false positives) but doesn’t catch every possible object (moderate recall). This means most of the detections it makes (e.g., for planes, meteorites, and wildlife) are correct, but it might miss some objects that don’t reach the 55% confidence score.
Lowering the threshold to 30%: The model picks up more objects that were previously missed (increasing recall) but also introduces more false positives (lowering precision). You’ll likely see branches, clouds, snow, or rain classified as birds or planes more often at this setting.
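The second stage is the YOLOv8M model; the thresholding logic itself is simple enough to show in isolation. This is a minimal sketch (the `Detection` class and sample values are illustrative, not the actual pipeline’s types):

```python
from dataclasses import dataclass

@dataclass
class Detection:
    label: str
    confidence: float  # model confidence score, 0.0 to 1.0

def filter_detections(detections, threshold: float = 0.55):
    """Keep only detections at or above the confidence threshold."""
    return [d for d in detections if d.confidence >= threshold]

# Illustrative detections from one frame:
dets = [
    Detection("plane", 0.91),
    Detection("bird", 0.62),
    Detection("bird", 0.41),   # kept only at the looser threshold
    Detection("plane", 0.22),  # likely a branch or cloud
]
```

At `threshold=0.55` only the first two survive; dropping to `0.30` also admits the 0.41 detection, which mirrors the precision/recall trade-off described above.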
About Precision and Recall:
Precision is the fraction of detections that are actually correct:
true-positives / (true-positives + false-positives)
Recall is the fraction of actual positive cases that the model successfully detects: true-positives / (true-positives + false-negatives).
As you lower the confidence threshold, the model becomes less “strict,” so it flags more objects as positives (increasing recall), but you’ll generally see more false positives (decreasing precision).
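The two formulas above translate directly into code. The example counts below are made up purely to illustrate the trade-off:

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of the model's detections that are actually correct."""
    return tp / (tp + fp)

def recall(tp: int, fn: int) -> float:
    """Fraction of real objects that the model managed to detect."""
    return tp / (tp + fn)

# Hypothetical counts at a strict threshold: few false positives,
# but some real objects are missed.
strict_p = precision(tp=90, fp=10)   # 0.90
strict_r = recall(tp=90, fn=30)      # 0.75
```

Lowering the threshold would raise `tp` and `fp` while shrinking `fn`, pushing recall up and precision down.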
4. Data Segmentation & Labeling
Segmentation: I group detections into 2-minute windows so that frames from the same event stay together, preventing data leakage between the dataset splits.
Dataset Split: The collected data is organized into training (88%), validation (10%), and test (2%) sets.
Labeling: I use a combination of automated tools and manual review to accurately label the data, ensuring reliable training and evaluation.
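One way to implement the window grouping and split is to assign whole 2-minute windows, never individual frames, to the train/validation/test sets. This is a sketch under that assumption, using the 88/10/2 ratios from the text (the function names and seed are my own):

```python
import random

WINDOW_SECONDS = 120  # 2-minute grouping window

def window_key(timestamp: float) -> int:
    """Map a Unix timestamp to its 2-minute window index."""
    return int(timestamp // WINDOW_SECONDS)

def split_windows(timestamps, ratios=(0.88, 0.10, 0.02), seed=42):
    """Shuffle the distinct windows and assign each whole window to a split.

    Because frames from the same window always land in the same split,
    near-duplicate frames of one event can't leak from train into test.
    """
    windows = sorted({window_key(t) for t in timestamps})
    rng = random.Random(seed)
    rng.shuffle(windows)
    n = len(windows)
    n_train = round(n * ratios[0])
    n_val = round(n * ratios[1])
    return {
        "train": set(windows[:n_train]),
        "val": set(windows[n_train:n_train + n_val]),
        "test": set(windows[n_train + n_val:]),
    }
```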
Thank you for joining me! I hope this clarifies how the process works. Feel free to ask questions or share your thoughts.