
Some questions about TBD-Baseline and detection/tracking performance #6

JingweiZhang12 opened this issue Aug 7, 2024 · 1 comment


JingweiZhang12 commented Aug 7, 2024

Thanks for your work and open-source code! I have some questions:

  1. From my perspective, the main difference between TBD-Baseline and ADA-Track is whether the track queries and detection queries perform self-attention together in the shared decoder layer. This mainly affects detection performance, since it fuses temporal information, and thereby indirectly improves the MOTA metric.
  2. As we know, MUTR3D is based on MOTRv1. Did you try the tricks of MOTRv3, such as a better label assignment for enhancing detection performance, on MUTR3D?
  3. Regarding the DETR3D/PETR detector, have you tested whether adding association layers has any impact on detection performance?

I'd appreciate it if you could answer the above questions. @dsx0511

On Aug 9, 2024, JingweiZhang12 changed the title from “Some questions about TBD-Baseline” to “Some questions about TBD-Baseline and detection/tracking performance”.
dsx0511 (Owner) commented Dec 7, 2024

Hi @JingweiZhang12 , sorry for the late reply. Here are the answers to your questions:

  1. The model architecture of ADA-Track is

for decoder_layer in decoder_layers:
    [track_query, detection_query] = self_attention([track_query, detection_query])  # '[]' denotes concatenation
    [track_query, detection_query] = cross_attention([track_query, detection_query], image_features)
    intermediate_track_boxes = MLP(track_query)
    intermediate_detection_boxes = MLP(detection_query)
    detection_query, edge_features = edge_augmented_cross_attention(track_query, detection_query, edge_features)  # track_query: key, detection_query: query

The model architecture of TBD-Baseline is

for detection_decoder_layer in detection_decoder_layers:
    [track_query, detection_query] = self_attention([track_query, detection_query])
    [track_query, detection_query] = cross_attention([track_query, detection_query], image_features)
    intermediate_track_boxes = MLP(track_query)
    intermediate_detection_boxes = MLP(detection_query)

for association_decoder_layer in association_decoder_layers:
    detection_query, edge_features = edge_augmented_cross_attention(track_query, detection_query, edge_features)

Therefore, the differences between ADA-Track and TBD-Baseline do not lie only in the self-attention. The interleaving also leads to a difference in association performance: the learned association modules in ADA-Track fuse detection information with previous association results layer by layer, yielding a mutual optimization of both tasks. In contrast, the association layers of TBD-Baseline cannot influence the detection layers in the forward pass (a toy sketch contrasting the two layouts follows at the end of this comment).

2/3. We have not tried these yet, but thank you for the suggestions; they are very valuable for our future work!

Hopefully the response is not too late for you. I'm looking forward to further discussion!
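
To make the contrast concrete, below is a minimal, runnable PyTorch sketch of the two decoder layouts described above. Everything in it is an illustrative assumption, not the actual ADA-Track code: ToyDecoderLayer, ToyEdgeAugCrossAttn, the 256-dim queries, and the toy edge update are hypothetical stand-ins.

# Minimal PyTorch sketch of the two decoder layouts. All names, dimensions,
# and the edge update are illustrative placeholders, not ADA-Track itself.
import torch
import torch.nn as nn

D = 256  # hypothetical query embedding dimension

class ToyDecoderLayer(nn.Module):
    # Joint self-attention over concatenated [track, det] queries,
    # then cross-attention to image features, as in the pseudocode above.
    def __init__(self):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(D, 8, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(D, 8, batch_first=True)

    def forward(self, queries, image_feats):
        q, _ = self.self_attn(queries, queries, queries)
        q, _ = self.cross_attn(q, image_feats, image_feats)
        return q

class ToyEdgeAugCrossAttn(nn.Module):
    # Stand-in for edge-augmented cross-attention: detection queries attend
    # to track queries, and pairwise edge features are updated.
    def __init__(self):
        super().__init__()
        self.attn = nn.MultiheadAttention(D, 8, batch_first=True)
        self.edge_mlp = nn.Linear(2 * D, D)

    def forward(self, track_q, det_q, edges):
        det_q, _ = self.attn(det_q, track_q, track_q)  # det: query, track: key/value
        n_t, n_d = track_q.shape[1], det_q.shape[1]
        # Toy edge update from all (track, detection) query pairs.
        pair = torch.cat([track_q.unsqueeze(2).expand(-1, -1, n_d, -1),
                          det_q.unsqueeze(1).expand(-1, n_t, -1, -1)], dim=-1)
        return det_q, edges + self.edge_mlp(pair)

def split(q, n_t):
    return q[:, :n_t], q[:, n_t:]

# ADA-Track style: one association step inside every decoder layer, so the
# updated detection queries and edges feed the next detection layer.
def ada_track_forward(dec_layers, assoc_layers, track_q, det_q, img, edges):
    for dec, assoc in zip(dec_layers, assoc_layers):
        q = dec(torch.cat([track_q, det_q], dim=1), img)
        track_q, det_q = split(q, track_q.shape[1])
        det_q, edges = assoc(track_q, det_q, edges)
    return track_q, det_q, edges

# TBD-Baseline style: all detection layers run first, association afterwards,
# so association can no longer influence detection in the forward pass.
def tbd_forward(dec_layers, assoc_layers, track_q, det_q, img, edges):
    for dec in dec_layers:
        q = dec(torch.cat([track_q, det_q], dim=1), img)
        track_q, det_q = split(q, track_q.shape[1])
    for assoc in assoc_layers:
        det_q, edges = assoc(track_q, det_q, edges)
    return track_q, det_q, edges

# Smoke test with random tensors (batch 1, 5 tracks, 9 detections).
dec = nn.ModuleList([ToyDecoderLayer() for _ in range(3)])
assoc = nn.ModuleList([ToyEdgeAugCrossAttn() for _ in range(3)])
t, d = torch.randn(1, 5, D), torch.randn(1, 9, D)
img, e = torch.randn(1, 300, D), torch.zeros(1, 5, 9, D)
ada_track_forward(dec, assoc, t, d, img, e)
tbd_forward(dec, assoc, t, d, img, e)

The only structural difference between the two forward functions is where the association step sits: inside the layer loop (ADA-Track) or after it (TBD-Baseline). That placement is exactly what allows, or prevents, the association result from feeding back into detection.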
