

Grounding Dino for open set object detection



This video covers Grounding Dino, Dino’s “open set” object detection sibling. It can detect objects from novel categories zero-shot, and it can also locate objects described by referring expressions such as “the lion most to the right”.
This video is part of a broader series, Modern Object Detection – from YOLO to Transformers: https://www.youtube.com/playlist?list=PL1HdfW5-F8AQlPZCJBq2gNjERTDEAl8v3. Check out the playlist for other object detection videos, including source code walkthroughs of Grounding Dino’s predecessors: DETR, Deformable DETR, DAB DETR, DN DETR and Dino.
Important links:
– Original paper: https://arxiv.org/pdf/2303.05499
– GLIP paper (another open set object detection model, with explanations of the training setup and datasets): https://openaccess.thecvf.com/content/CVPR2022/papers/Li_Grounded_Language-Image_Pre-Training_CVPR_2022_paper.pdf

00:00 – Intro
02:08 – Prerequisites
05:12 – Dino overview
13:05 – Bert overview
15:39 – Grounding Dino Inputs and Outputs
21:37 – Sub Sentence Level Attention Mask
24:29 – Cross Modality Feature Enhancer
30:06 – Language Guided Query Selection
35:32 – Cross Modality Decoder
39:19 – Token Output
41:00 – Training Data
48:05 – Results & Next Up
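
For readers who want a concrete picture of the "Language Guided Query Selection" chapter: as described in the original paper, this step scores every image token against every text token, keeps each image token's best text match, and picks the top-scoring image tokens to initialize the decoder queries. The snippet below is a minimal pure-Python sketch of that idea on toy data. The function name `select_queries`, the tiny feature vectors and the plain dot-product similarity are illustrative assumptions, not the official implementation.

```python
def select_queries(image_feats, text_feats, num_queries):
    """Return indices of the image tokens that best match any text token.

    Illustrative sketch only: real models work on batched tensors.
    """
    scores = []
    for img_vec in image_feats:
        # Dot product of this image token with every text token.
        sims = [sum(a * b for a, b in zip(img_vec, txt_vec))
                for txt_vec in text_feats]
        # Keep the best match over the text dimension.
        scores.append(max(sims))
    # Rank image tokens by their best text match, highest first.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:num_queries]


# Toy features (assumed values): 4 image tokens, 2 text tokens, 3 dimensions.
image_feats = [
    [0.9, 0.1, 0.0],
    [0.0, 0.2, 0.1],
    [0.1, 0.8, 0.3],
    [0.2, 0.2, 0.2],
]
text_feats = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
]

selected = select_queries(image_feats, text_feats, num_queries=2)
print(selected)  # [0, 2]
```

Image tokens 0 and 2 are selected because each aligns strongly with one of the two text tokens. Taking the maximum over text tokens means an image token only needs to match one word of the prompt well to be picked.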


Author

MQ
