
Object Detection with Transformers (DETR)



The content is also available as text: https://github.com/adensur/blog/blob/main/computer_vision_zero_to_hero/12_detr/Readme.md

This video is part of my “Modern Object Detection: from YOLO to transformers” series: https://www.youtube.com/playlist?list=PL1HdfW5-F8AQlPZCJBq2gNjERTDEAl8v3
It talks about DETR, the first transformer-based object detection model, which aimed to simplify the overall approach to object detection, making it single-stage and free of hand-crafted components.
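To make "single-stage and free of hand-crafted components" concrete, here is a minimal inference sketch in Python. It assumes the pretrained detr_resnet50 model that the authors publish via torch.hub (facebookresearch/detr) and the output format described in the paper: a fixed set of 100 predictions with class logits and normalized (cx, cy, w, h) boxes.

```python
import torch
import torchvision.transforms as T
from PIL import Image

# Pretrained DETR with a ResNet-50 backbone from the authors' torch.hub entry point.
model = torch.hub.load("facebookresearch/detr", "detr_resnet50", pretrained=True)
model.eval()

# Standard ImageNet preprocessing, which DETR was trained with.
transform = T.Compose([
    T.Resize(800),
    T.ToTensor(),
    T.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

img = Image.open("street.jpg").convert("RGB")   # any test image (hypothetical path)
x = transform(img).unsqueeze(0)                 # shape [1, 3, H, W]

with torch.no_grad():
    out = model(x)

# A fixed set of 100 predictions per image: class logits (91 COCO classes + "no object")
# and boxes in normalized (cx, cy, w, h) format. No anchors, no NMS.
probs = out["pred_logits"].softmax(-1)[0, :, :-1]   # drop the "no object" column
boxes = out["pred_boxes"][0]
keep = probs.max(-1).values > 0.9                   # simple confidence threshold
print(boxes[keep], probs[keep].argmax(-1))
```

The post-processing here is just a confidence threshold; there is no anchor generation or non-maximum suppression step, which is exactly the simplification the video focuses on.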
The video goes through the following in detail:
– Direct set prediction and the bipartite matching loss (a minimal matching sketch in Python follows this list)
– What transformers and attention are
– How attention is used in DETR to form the encoder-decoder (a self-attention sketch also follows below)
– Cool visualisations of attention masks
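Here is a minimal sketch of the matching step behind the bipartite matching loss, using scipy's Hungarian solver. The cost here is simplified to classification probability plus weighted L1 box distance; the full DETR cost also adds a generalized IoU term, and the relative weight below is an assumption for illustration.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match(pred_probs, pred_boxes, gt_labels, gt_boxes):
    """One-to-one matching between N predictions and M ground-truth objects (N >= M).

    pred_probs: [N, num_classes] softmax probabilities
    pred_boxes: [N, 4] normalized (cx, cy, w, h)
    gt_labels:  [M] class indices
    gt_boxes:   [M, 4]
    Returns (pred_idx, gt_idx) arrays of length M.
    """
    # Classification cost: negative probability of the ground-truth class.
    cost_class = -pred_probs[:, gt_labels]                                     # [N, M]
    # Box cost: L1 distance (the paper also adds a generalized IoU term).
    cost_box = np.abs(pred_boxes[:, None, :] - gt_boxes[None, :, :]).sum(-1)   # [N, M]
    cost = cost_class + 5.0 * cost_box          # relative weight is an assumption here
    pred_idx, gt_idx = linear_sum_assignment(cost)   # Hungarian algorithm
    return pred_idx, gt_idx
```

Each ground-truth object gets exactly one prediction; every unmatched prediction is trained towards the "no object" class, which is what makes NMS unnecessary.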
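And a minimal sketch of single-head scaled dot-product self-attention, the operation at the core of both the encoder and the decoder. The names and shapes here are generic PyTorch, not DETR's actual implementation.

```python
import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    x: [seq_len, d_model] input sequence (e.g. a flattened CNN feature map in DETR's encoder)
    w_q, w_k, w_v: [d_model, d_k] projection matrices
    """
    q = x @ w_q                                  # queries  [seq_len, d_k]
    k = x @ w_k                                  # keys     [seq_len, d_k]
    v = x @ w_v                                  # values   [seq_len, d_k]
    scores = q @ k.T / (k.shape[-1] ** 0.5)      # [seq_len, seq_len] pairwise similarities
    weights = F.softmax(scores, dim=-1)          # each row sums to 1
    return weights @ v                           # weighted sum of values

# Toy usage: 6 "pixels" with 8-dimensional features attending to each other.
d_model, d_k = 8, 8
x = torch.randn(6, d_model)
w_q, w_k, w_v = (torch.randn(d_model, d_k) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)    # torch.Size([6, 8])
```

Cross-attention in the decoder is the same computation, except the queries come from the learned object queries while the keys and values come from the encoder output.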

Useful links:
– Original paper: https://arxiv.org/pdf/2005.12872.pdf
– Cool post explaining positional encodings in detail (a minimal sinusoidal encoding sketch follows this list): https://towardsdatascience.com/master-positional-encoding-part-i-63c05d90a0c3
– My video about YOLO algorithm: https://youtu.be/QHoAWDI8g_c
– My video about how ResNet model works: https://youtu.be/uztrVK1BhGw
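For the positional encoding part, here is a minimal sketch of the standard 1-D sinusoidal encoding from "Attention Is All You Need". DETR itself uses a 2-D variant built over the x and y coordinates of the feature map, so treat this as an illustration of the idea only.

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len, d_model):
    """Standard 1-D sinusoidal positional encoding, shape [seq_len, d_model].

    Even dimensions use sine, odd dimensions use cosine, with wavelengths
    increasing geometrically with the dimension index.
    """
    positions = np.arange(seq_len)[:, None]                    # [seq_len, 1]
    dims = np.arange(d_model // 2)[None, :]                    # [1, d_model/2]
    angles = positions / np.power(10000, 2 * dims / d_model)   # [seq_len, d_model/2]
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

print(sinusoidal_positional_encoding(seq_len=50, d_model=128).shape)  # (50, 128)
```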

00:00 – Intro
05:49 – Motivation behind DETR
08:50 – Direct Set Prediction
18:52 – Transformers and Attention
21:51 – Input to Transformers: Sequences
24:16 – Self Attention
37:14 – Cross Attention
44:02 – Positional Encoding
51:08 – Analysis
56:41 – Next Up
