Saliency-Guided DETR for Moment Retrieval and Highlight Detection

Author :  Aleksandr Gordeev, Vladimir Dokholyan, Irina Tolstykh, Maksim Kuprashevich

Affiliation :  SALUTEDEV

Country :  Uzbekistan

Category :  Artificial Intelligence

Volume, Issue, Month, Year :  -, -, October, 2024

Abstract :


Existing approaches to video moment retrieval and highlight detection cannot align text and video features efficiently, which results in unsatisfactory performance and limited production use. To address this, we propose a novel architecture that utilizes recent foundational video models designed for such alignment. Combined with the proposed Saliency-Guided Cross Attention mechanism and a hybrid DETR architecture, our approach significantly improves performance on both moment retrieval and highlight detection. To improve results further, we developed InterVid-MR, a large-scale, high-quality dataset for pretraining. Using it, our architecture achieves state-of-the-art results on the QVHighlights, Charades-STA, and TACoS benchmarks. The proposed approach provides an efficient and scalable solution for both zero-shot and fine-tuning scenarios in video-language tasks. The dataset and code will be published as open source.

Keyword :  video moment retrieval, highlight detection, DETR, Cross Attention, transformer

URL :  https://arxiv.org/abs/2410.01615

User Name : mvkuprashevich
Posted 06-08-2025 on 09:46:21 AEDT



Related Research Work

  • Augmented and Synthetic Data in Artificial Intelligence
  • NoHumansRequired: Autonomous High-Quality Image Editing Triplet Mining
  • CerberusDet: Unified Multi-Dataset Object Detection
  • GigaCheck: Detecting LLM-Generated Content
