Vivek Sharma, Makarand Tapaswi, and Rainer Stiefelhagen
In IEEE International Conference on Computer Vision (ICCV) Workshop on Large Scale Holistic Video Understanding, 2019
Demo:
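$ from TCBP import TCBP
$ import torch
$ # Random input; the shape appears to be (batch, feature_dim, temporal_window, spat_x, spat_y)
$ data = torch.rand([10, 8192, 4, 1, 1])
$ tcbp = TCBP(input_dim1=8192, input_dim2=8192, output_dim=512, temporal_window=4, spat_x=1, spat_y=1)
$ # Pool the two inputs (the same tensor is passed twice here)
$ tcbp_representation = tcbp(data, data)
$ tcbp_representation.shape
$ ---> torch.Size([10, 512])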
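The shapes in the demo suggest a simple contract: a 5-D input goes in and one output_dim-sized vector per batch element comes out. The sketch below only spells out that shape bookkeeping in plain Python. The helper name expected_tcbp_output_shape is hypothetical and not part of the TCBP module, and the axis layout (batch, input_dim, temporal_window, spat_x, spat_y) is an assumption inferred from the demo values, not confirmed by the TCBP source.

```python
def expected_tcbp_output_shape(input_shape, input_dim, output_dim,
                               temporal_window, spat_x, spat_y):
    """Hypothetical helper: check an input shape against the demo's layout.

    Assumes the layout (batch, input_dim, temporal_window, spat_x, spat_y),
    which is inferred from the demo and not confirmed by the TCBP source.
    """
    if len(input_shape) != 5:
        raise ValueError(f"expected a 5-D shape, got {len(input_shape)}-D")
    batch, dim, t, x, y = input_shape
    expected = (input_dim, temporal_window, spat_x, spat_y)
    if (dim, t, x, y) != expected:
        raise ValueError(
            f"trailing dims {(dim, t, x, y)} do not match {expected}")
    # In the demo, every axis except the batch axis is pooled into
    # output_dim values, so the result is (batch, output_dim).
    return (batch, output_dim)


shape = expected_tcbp_output_shape(
    (10, 8192, 4, 1, 1), input_dim=8192, output_dim=512,
    temporal_window=4, spat_x=1, spat_y=1)
print(shape)  # (10, 512)
```

A check like this can be run on tensor.shape before calling the module, so that a mismatched temporal window or spatial size fails with a readable message.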
If you find the code and datasets useful in your research, please cite:
@inproceedings{tcbp,
author = {Sharma, Vivek and Tapaswi, Makarand and Stiefelhagen, Rainer},
title = {Deep Multimodal Feature Encoding for Video Ordering},
booktitle = {IEEE ICCV Workshop on Large Scale Holistic Video Understanding},
year = {2019}
}