Unable to capture precise publication time for YouTube video metadata #1343

Open
Esparon1 opened this issue Apr 20, 2024 · 12 comments · May be fixed by #2114
Labels
easy (Easy difficulty), enhancement (New feature or request), good first issue (Good for newcomers)

Comments

@Esparon1
Contributor

🐛 Describe the bug

Embedchain currently captures the publish date of YouTube videos but omits the precise publication time.
The metadata stored for each video records the date as YYYY-MM-DD 00:00:00. The time always defaults to 00:00:00, which indicates that the time component is not being processed or stored. A date without a time is a problem because of time zones: the same moment can fall on different calendar dates depending on where the viewer is.

Example of the issue:
For a video published at 2:30 PM UTC on April 15, 2024, Embedchain stores the publish date as 2024-04-15 00:00:00,
whereas it should be stored in ISO 8601 format (for example, 2024-04-15T14:30:00Z).
This format allows the publication time to be used globally without confusion about time zone differences.
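
To make the difference concrete, here is a minimal sketch using only the Python standard library. The helper name `to_iso8601_utc` is hypothetical and is not part of Embedchain; it only illustrates how a time-zone-aware timestamp maps to the ISO 8601 form requested above.

```python
from datetime import datetime, timedelta, timezone


def to_iso8601_utc(dt: datetime) -> str:
    """Format a time-zone-aware datetime as ISO 8601 in UTC with a trailing 'Z'."""
    if dt.tzinfo is None:
        # A naive datetime carries no zone, so it cannot be converted safely.
        raise ValueError("naive datetime: the time zone is unknown")
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# What is stored today: a date with the time defaulting to midnight.
date_only = datetime(2024, 4, 15)
print(date_only.strftime("%Y-%m-%d %H:%M:%S"))  # 2024-04-15 00:00:00

# The same instant expressed in two zones normalizes to one UTC string.
published_utc = datetime(2024, 4, 15, 14, 30, tzinfo=timezone.utc)
published_ist = datetime(2024, 4, 15, 20, 0,
                         tzinfo=timezone(timedelta(hours=5, minutes=30)))
print(to_iso8601_utc(published_utc))  # 2024-04-15T14:30:00Z
print(to_iso8601_utc(published_ist))  # 2024-04-15T14:30:00Z
```

Rejecting naive datetimes is a deliberate choice in this sketch: silently assuming UTC would reintroduce the ambiguity the issue describes.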

@jsjeon-um

Can I work on this?

Dev-Khant added the enhancement, good first issue, and easy labels on Jun 2, 2024
@MoizKhuzema

@Dev-Khant Has the migration from pytube to yt-dlp been merged? If so, yt-dlp returns metadata such as the publication time, which would make this a very easy fix. I could do it.

@Dev-Khant
Member

> @Dev-Khant Has the migration from pytube to yt-dlp been merged? If so, yt-dlp returns metadata such as the publication time, which would make this a very easy fix. I could do it.

No, sorry, it has not been merged yet. But please feel free to open a PR for this using yt-dlp.
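
As a rough sketch of what a yt-dlp based fix could look like (not the actual Embedchain loader code): yt-dlp's info dictionary documents a `timestamp` field (UNIX time) and an `upload_date` field (`YYYYMMDD`). The helper names `publish_time_iso` and `fetch_info` are hypothetical, and whether `timestamp` is populated for a given video depends on the yt-dlp version and extractor, so the code falls back to the date alone.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def publish_time_iso(info: dict[str, Any]) -> Optional[str]:
    """Derive an ISO 8601 publish time from a yt-dlp style info dict."""
    ts = info.get("timestamp")
    if ts is not None:
        # Precise moment available: normalize to UTC with a trailing 'Z'.
        moment = datetime.fromtimestamp(ts, tz=timezone.utc)
        return moment.strftime("%Y-%m-%dT%H:%M:%SZ")
    upload_date = info.get("upload_date")
    if upload_date:
        # Only the date is known; return it without inventing a time.
        return datetime.strptime(upload_date, "%Y%m%d").date().isoformat()
    return None


def fetch_info(url: str) -> dict[str, Any]:
    """Fetch metadata without downloading the video (needs yt-dlp and network)."""
    import yt_dlp  # imported lazily so publish_time_iso works without yt-dlp

    with yt_dlp.YoutubeDL({"quiet": True, "skip_download": True}) as ydl:
        return ydl.extract_info(url, download=False)


# Offline illustration with a hand-written dict in place of a real response.
print(publish_time_iso({"timestamp": 1713191400}))   # 2024-04-15T14:30:00Z
print(publish_time_iso({"upload_date": "20240415"}))  # 2024-04-15
```

Returning a bare date in the fallback case, rather than padding it with 00:00:00, keeps a missing time distinguishable from a video that was really published at midnight.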

@MoizKhuzema

MoizKhuzema commented Jun 19, 2024

> > @Dev-Khant Has the migration from pytube to yt-dlp been merged? Because if so, yt-dlp returns metadata such as publication time and this will be a very easy fix then. I could do it
>
> No sorry it has not been merged yet. But please feel free to open a PR for this using yt-dlp.

@Dev-Khant, I was writing the script for the yt-dlp migration and realized that although the YouTube data loader does return metadata for videos, the LLM does not seem to have access to it. It only seems to have access to the video context.

I ran these 4 queries:
1. What is the publication date of the video?
2. Do you have any information regarding this video's metadata?
3. What is the context you were provided with?
4. What is the name of the YouTube channel and the length of the video?

Here are the results:
[Screenshot: Capture]

Is this by design, or is it an issue?

@Dev-Khant
Member

@MoizKhuzema If you pass citations=True to app.query(), you will also get back the metadata for that video, including the publish date, author, title, etc.

So it's by design that you get metadata separately.
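
A short sketch of that usage follows. It assumes, based on the Embedchain documentation, that `app.query(..., citations=True)` returns an `(answer, sources)` pair in which each source is a `(chunk, metadata)` tuple; the metadata key names shown in the sample data are illustrative assumptions, and `collect_metadata` and `ask_with_citations` are hypothetical helpers.

```python
from typing import Any

# Assumed shape of the citations: a list of (text chunk, metadata dict) pairs.
Sources = list[tuple[str, dict[str, Any]]]


def collect_metadata(sources: Sources) -> list[dict[str, Any]]:
    """Pull the metadata dict out of each (chunk, metadata) citation."""
    return [metadata for _chunk, metadata in sources]


def ask_with_citations(url: str, question: str) -> tuple[str, list[dict[str, Any]]]:
    """Query a YouTube video and return the answer plus per-chunk metadata.

    Needs embedchain installed and an LLM API key configured, so the
    import is done lazily and this function is not exercised below.
    """
    from embedchain import App

    app = App()
    app.add(url, data_type="youtube_video")
    answer, sources = app.query(question, citations=True)
    return answer, collect_metadata(sources)


# Offline illustration with hand-written citations (key names are assumed).
sample: Sources = [
    ("transcript chunk ...", {"url": "https://example.com/watch", "title": "Demo"}),
]
print(collect_metadata(sample))
```

Because the metadata travels with the citations rather than inside the prompt context, questions such as "what is the publication date?" are answered from this returned structure, not by the LLM itself.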

@MoizKhuzema

> @MoizKhuzema If you pass citations=True to app.query(), you will also get back the metadata for that video, including the publish date, author, title, etc.
>
> So it's by design that you get metadata separately.

Understood, thanks

@02shanks

@Dev-Khant is this issue open for contribution?

@Dev-Khant
Member

@MoizKhuzema Please let us know whether you are still working on this; otherwise @02shanks would like to pick it up. Thanks.

@shivani-developer

I would like to pick this up as I see it's still open.

@burnerlee

@Dev-Khant, can I pick this up?

@Dev-Khant
Member

@burnerlee Sure, feel free to work on this!

@hensikavar

@Dev-Khant, may I pick up this issue?


8 participants