
Object Detection with Transformers (DETR)



The content is also available as text: https://github.com/adensur/blog/blob/main/computer_vision_zero_to_hero/12_detr/Readme.md

This video is part of my “Modern Object Detection: from YOLO to transformers” series: https://www.youtube.com/playlist?list=PL1HdfW5-F8AQlPZCJBq2gNjERTDEAl8v3
It talks about DETR, the first transformer-based object detection model, which aimed to simplify the overall approach to object detection, making it single-stage and free of hand-crafted components.
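To make "single-stage and free of hand-crafted components" concrete, here is a minimal inference sketch in Python. It assumes the pretrained detr_resnet50 model that the authors publish via torch.hub (facebookresearch/detr) and the output format described in the paper: a fixed set of 100 predictions with class logits and normalized (cx, cy, w, h) boxes.

```python
import torch
import torchvision.transforms as T
from PIL import Image

# Pretrained DETR with a ResNet-50 backbone from the authors' torch.hub entry point.
model = torch.hub.load("facebookresearch/detr", "detr_resnet50", pretrained=True)
model.eval()

# Standard ImageNet preprocessing, which DETR was trained with.
transform = T.Compose([
    T.Resize(800),
    T.ToTensor(),
    T.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

img = Image.open("street.jpg").convert("RGB")   # any test image (hypothetical path)
x = transform(img).unsqueeze(0)                 # shape [1, 3, H, W]

with torch.no_grad():
    out = model(x)

# A fixed set of 100 predictions per image: class logits (91 COCO classes + "no object")
# and boxes in normalized (cx, cy, w, h) format. No anchors, no NMS.
probs = out["pred_logits"].softmax(-1)[0, :, :-1]   # drop the "no object" column
boxes = out["pred_boxes"][0]
keep = probs.max(-1).values > 0.9                   # simple confidence threshold
print(boxes[keep], probs[keep].argmax(-1))
```

The post-processing here is just a confidence threshold; there is no anchor generation or non-maximum suppression step, which is exactly the simplification the video focuses on.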
The video goes through the following in detail:
– Direct set prediction and the bipartite matching loss (a minimal matching sketch in Python follows this list)
– What transformers and attention are
– How attention is used in DETR to form the encoder-decoder (a self-attention sketch also follows below)
– Cool visualisations of attention masks
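Here is a minimal sketch of the matching step behind the bipartite matching loss, using scipy's Hungarian solver. The cost here is simplified to classification probability plus weighted L1 box distance; the full DETR cost also adds a generalized IoU term, and the relative weight below is an assumption for illustration.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match(pred_probs, pred_boxes, gt_labels, gt_boxes):
    """One-to-one matching between N predictions and M ground-truth objects (N >= M).

    pred_probs: [N, num_classes] softmax probabilities
    pred_boxes: [N, 4] normalized (cx, cy, w, h)
    gt_labels:  [M] class indices
    gt_boxes:   [M, 4]
    Returns (pred_idx, gt_idx) arrays of length M.
    """
    # Classification cost: negative probability of the ground-truth class.
    cost_class = -pred_probs[:, gt_labels]                                     # [N, M]
    # Box cost: L1 distance (the paper also adds a generalized IoU term).
    cost_box = np.abs(pred_boxes[:, None, :] - gt_boxes[None, :, :]).sum(-1)   # [N, M]
    cost = cost_class + 5.0 * cost_box          # relative weight is an assumption here
    pred_idx, gt_idx = linear_sum_assignment(cost)   # Hungarian algorithm
    return pred_idx, gt_idx
```

Each ground-truth object gets exactly one prediction; every unmatched prediction is trained towards the "no object" class, which is what makes NMS unnecessary.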
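And a minimal sketch of single-head scaled dot-product self-attention, the operation at the core of both the encoder and the decoder. The names and shapes here are generic PyTorch, not DETR's actual implementation.

```python
import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    x: [seq_len, d_model] input sequence (e.g. a flattened CNN feature map in DETR's encoder)
    w_q, w_k, w_v: [d_model, d_k] projection matrices
    """
    q = x @ w_q                                  # queries  [seq_len, d_k]
    k = x @ w_k                                  # keys     [seq_len, d_k]
    v = x @ w_v                                  # values   [seq_len, d_k]
    scores = q @ k.T / (k.shape[-1] ** 0.5)      # [seq_len, seq_len] pairwise similarities
    weights = F.softmax(scores, dim=-1)          # each row sums to 1
    return weights @ v                           # weighted sum of values

# Toy usage: 6 "pixels" with 8-dimensional features attending to each other.
d_model, d_k = 8, 8
x = torch.randn(6, d_model)
w_q, w_k, w_v = (torch.randn(d_model, d_k) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)    # torch.Size([6, 8])
```

Cross-attention in the decoder is the same computation, except the queries come from the learned object queries while the keys and values come from the encoder output.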

Useful links:
– Original paper: https://arxiv.org/pdf/2005.12872.pdf
– Cool post explaining positional encodings in detail (a minimal sinusoidal encoding sketch follows this list): https://towardsdatascience.com/master-positional-encoding-part-i-63c05d90a0c3
– My video about YOLO algorithm: https://youtu.be/QHoAWDI8g_c
– My video about how ResNet model works: https://youtu.be/uztrVK1BhGw
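For the positional encoding part, here is a minimal sketch of the standard 1-D sinusoidal encoding from "Attention Is All You Need". DETR itself uses a 2-D variant built over the x and y coordinates of the feature map, so treat this as an illustration of the idea only.

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len, d_model):
    """Standard 1-D sinusoidal positional encoding, shape [seq_len, d_model].

    Even dimensions use sine, odd dimensions use cosine, with wavelengths
    increasing geometrically with the dimension index.
    """
    positions = np.arange(seq_len)[:, None]                    # [seq_len, 1]
    dims = np.arange(d_model // 2)[None, :]                    # [1, d_model/2]
    angles = positions / np.power(10000, 2 * dims / d_model)   # [seq_len, d_model/2]
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

print(sinusoidal_positional_encoding(seq_len=50, d_model=128).shape)  # (50, 128)
```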

00:00 – Intro
05:49 – Motivation behind DETR
08:50 – Direct Set Prediction
18:52 – Transformers and Attention
21:51 – Input to Transformers: Sequences
24:16 – Self Attention
37:14 – Cross Attention
44:02 – Positional Encoding
51:08 – Analysis
56:41 – Next Up
