
About video fragments #61

Open
sameerKgp opened this issue May 14, 2024 · 11 comments
Comments

@sameerKgp

Hi, thanks for providing the code for your work. In the code, what is the video_fragment? Is it for the breakpoint mode? How are these fragments created? Also, in src/video_fragment you have provided a clip from a different video (GOT) than the Cooking_cake one.

@Espere-1119-Song
Collaborator

video_fragment stores the video clip read by the sliding window; it is created and updated automatically. Also, I couldn't find the GOT video. Can you point out the exact path? We didn't upload Cooking_cake since it is too big to upload to GitHub.

@sameerKgp
Author

Thanks for the reply. I got the Cooking_cake video from the link provided in issue #15. The GOT video is at src/video_fragment/output.mp4.

@HTD1016

HTD1016 commented Jul 9, 2024

I still don't know how to create the video fragment if I use my own video. There are no such functions that I can find in the "Chat" class. Maybe in "global mode" the video fragment is also the original video? Does that mean I need to store the same video at the "video fragment path" as at the "video path"?

@Espere-1119-Song
Collaborator

You just need to choose one video as the initialized video fragment at the beginning; the other video fragments will be created automatically.
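As a rough sketch of that flow (the helper names and file paths here are my own, not from the repository; only the `per_video_length = video_length / n_samples` arithmetic mirrors the `parse_video_fragment` code shown later in this thread): seed the fragment file once with any short clip, then let each sliding-window step overwrite it.

```python
import shutil

def init_fragment(seed_clip_path: str, fragment_video_path: str) -> None:
    # Seed the fragment file with any short clip; the pipeline then
    # overwrites it on every sliding-window step.
    shutil.copyfile(seed_clip_path, fragment_video_path)

def window_bounds(video_length: float, n_samples: int, n_stage: int):
    # Start/end time (in seconds) of the n_stage-th sliding window,
    # mirroring per_video_length = video_length / n_samples.
    per_video_length = video_length / n_samples
    return n_stage * per_video_length, (n_stage + 1) * per_video_length
```

So a 64-second video with `n_samples = 32` yields 2-second windows, and the n_stage-th call slices out exactly one of them.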

@HTD1016

HTD1016 commented Jul 10, 2024

Thanks for the reply. I used the MovieChat package from PyPI (version 0.6.3), and I carefully checked the code in the package.
In /anaconda/envs/MovieChat/lib/python3.9/site-packages/MovieChat/models/chat_model.py:

for i in range(num_frames): 
    print(f"current processed frames: {i+1} / {num_frames}")
    video_fragment = self.parse_video_fragment(video_path=video_path, video_length=video_length, n_stage=i)         
    video_fragment, msg = self.load_video(
        video_path=fragment_video_path,
        n_frms=4, 
        height=224,
        width=224
    )
    video_fragment = self.vis_processor.transform(video_fragment) 
    video_fragment = video_fragment.unsqueeze(0).to(self.device)

where self.parse_video_fragment() is used to create the video fragment, and then self.load_video() reads the video fragment from fragment_video_path. It follows that self.parse_video_fragment() should save the video fragment locally.
Now take a look at the self.parse_video_fragment() function:

def parse_video_fragment(self, video_path, fragment_video_path, video_length, n_stage = 0):
    decord.bridge.set_bridge("torch")
    per_video_length = video_length / self.n_samples
    fragment_video = self.capture_video(video_path, per_video_length, n_stage)
    fragment_video.write_videofile(fragment_video_path)  # This code was added by me, as well as the parameter "fragment_video_path"
    return fragment_video

So I think a line of code is missing here. After I added this line, the code works normally. I also noticed that the author's code repository provides a local version of MovieChat that includes this line.
However, because of the time MoviePy takes to write videos, the inference time of the whole pipeline becomes very long.
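One possible workaround for the write-time cost (my own sketch, not part of MovieChat; the helper names are made up) is to cut each fragment with ffmpeg's stream copy instead of re-encoding through MoviePy's `write_videofile`:

```python
import subprocess

def build_ffmpeg_cut_cmd(video_path, start, end, out_path):
    # ffmpeg command that cuts [start, end) with stream copy (-c copy),
    # i.e. no re-encoding, which is much faster than write_videofile.
    return [
        "ffmpeg", "-y",
        "-ss", str(start), "-to", str(end),
        "-i", video_path,
        "-c", "copy",
        out_path,
    ]

def cut_fragment(video_path, start, end, out_path):
    subprocess.run(build_ffmpeg_cut_cmd(video_path, start, end, out_path),
                   check=True)
```

Note that `-c copy` can only cut at keyframe boundaries, so fragment edges are approximate; if frame-exact windows matter, re-encoding is still needed.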

@Espere-1119-Song
Collaborator

Thank you very much for discovering this issue. We will recheck our code and update the MovieChat package as soon as possible to resolve this problem.

@ywh187

ywh187 commented Sep 2, 2024

for i in range(num_frames):
    print(f"current processed frames: {i+1} / {num_frames}")
    video_fragment = self.parse_video_fragment(video_path=video_path, video_length=video_length, n_stage=i)
    video_fragment, msg = self.load_video(
        video_path=fragment_video_path,
        n_frms=4,
        height=224,
        width=224
    )
    video_fragment = self.vis_processor.transform(video_fragment)
    video_fragment = video_fragment.unsqueeze(0).to(self.device)

I noticed that video_fragment is assigned by the parse_video_fragment call on the third line of this snippet, but then immediately overwritten by load_video on the next line. The first assignment seems redundant, since its return value is never used before being reassigned.

@Espere-1119-Song
Collaborator

I understand what you mean. During implementation, we found that some versions of ffmpeg may not support initializing a blank video fragment, so we used an unrelated video clip for initialization.

@allent4n

@HTD1016 You are just amazing!!!

@oximi123

> video_fragment stores the video clip read by the sliding window, and it will be created and automatically updated. Also I didn't find the GOT video, can u point out the exact path? We didn't upload Cooking_cake since it is too big to upload on Github.

Hi, I have two small questions about these two hyperparameters in run_inference_qa_msvd.py:

    MAX_INT = 8
    N_SAMPLES = 32

According to my understanding, N_SAMPLES specifies how many fragments (or sliding windows) will be created for each video, and MAX_INT specifies how many frames we will use for encoding as LLM input for each fragment/sliding window. Is that correct?

@Espere-1119-Song
Collaborator

Sorry for the confusion. N_SAMPLES specifies how many fragments (or sliding windows) will be created for each video. However, MAX_INT is not utilized in the current implementation. In our code, the number of frames included within each sliding window corresponds to the length of the short-term memory window used for encoding.
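In other words (a toy sketch with hypothetical parameter names, just restating the accounting in the reply above): the per-window frame count comes from the short-term memory length, not from MAX_INT.

```python
def frames_per_window(short_memory_length: int) -> int:
    # Per the maintainer's reply: each sliding window contributes as many
    # frames to the encoder as the short-term memory window holds.
    # MAX_INT plays no role in the current implementation.
    return short_memory_length

def total_encoded_frames(n_samples: int, short_memory_length: int) -> int:
    # N_SAMPLES windows, each contributing frames_per_window frames.
    return n_samples * frames_per_window(short_memory_length)
```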
