Audio Visual Tutorials

AI model predicts human attention in 360-degree videos using both sound and vision

Virtual reality (VR) experiences and 360-degree videos are transforming viewers from passive observers into active ...

Spiking Tucker Fusion Transformer for Audio-Visual Zero-Shot Learning

Abstract: The spiking neural networks (SNNs) that efficiently encode temporal sequences have shown great potential in extracting audio-visual joint feature representations. However, coupling SNNs ...

IEEE

A Generative Approach to Audio-Visual Generalized Zero-Shot Learning: Combining Contrastive and Discriminative Techniques

Abstract: Audio-visual generalized zero-shot learning (AV-GZSL) for video classification is a task where the model learns to identify unseen video classes from multimodal audio-visual inputs. This is ...
