Metadata field can't handle strings longer than 255 characters #23
Comments
The workaround was to replace:
    all_metadata.append((image, {'caption': caption, 'score': score}))
with:
    all_metadata.append((image, {'caption': caption[:255], 'score': score}))
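For anyone hitting the same limit, here is a minimal sketch that applies the same clipping to every string value in a metadata dict rather than only to the caption. The helper name `clip_metadata`, the placeholder image path, and the sample values are hypothetical. The 255 limit comes from the issue title, and it is an assumption that the limit applies to all string fields.

```python
# Assumed limit, taken from the issue title; not confirmed for every field type.
MAX_METADATA_STR_LEN = 255


def clip_metadata(metadata, max_len=MAX_METADATA_STR_LEN):
    """Return a copy of `metadata` with string values truncated to `max_len` characters.

    Non-string values (numbers, booleans, ...) are passed through unchanged.
    """
    return {
        key: value[:max_len] if isinstance(value, str) else value
        for key, value in metadata.items()
    }


# Hypothetical sample data standing in for one (image, caption, score) record.
image = "images/0001.jpg"
caption = "a squirrel " * 40  # 440 characters, longer than the limit
score = 0.87

all_metadata = []
all_metadata.append((image, clip_metadata({'caption': caption, 'score': score})))
```

Clipping silently drops the tail of the caption, so keyword filters will miss any word that only appears after the 255th character.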
Interesting. Would you consider it OK to clip the content of the value?
I think there's a fine line between metadata and labels 😄 My intended use case was to filter the images based on keywords in the caption, e.g. look for all images with the word "squirrel" in the caption. If you changed this to a blob type, I probably wouldn't be able to do that, right? Another option I would have, as a user, is to run some NLP functions to pull out the nouns, verbs, and adjectives from the caption and upload only those as metadata.
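A rough sketch of that second option, using only the standard library. This is a simplified stand-in: it filters a small hand-written stopword list instead of doing real part-of-speech tagging, so it does not actually separate nouns, verbs, and adjectives. The function name `caption_keywords` and the stopword list are hypothetical, and the 255 limit is the one from the issue title.

```python
import re

# Tiny illustrative stopword list; a real pipeline would use a proper NLP library.
STOPWORDS = {"a", "an", "the", "on", "in", "of", "and", "with", "is", "at", "to"}


def caption_keywords(caption, max_len=255):
    """Reduce a caption to unique, lowercased keywords that fit within `max_len` characters."""
    words = re.findall(r"[a-z]+", caption.lower())

    # Keep first-occurrence order while dropping stopwords and duplicates.
    unique = []
    for word in words:
        if word not in STOPWORDS and word not in unique:
            unique.append(word)

    # Join whole words only, stopping before the limit would be exceeded.
    joined = ""
    for word in unique:
        candidate = word if not joined else joined + " " + word
        if len(candidate) > max_len:
            break
        joined = candidate
    return joined


keywords = caption_keywords("A squirrel sitting on a branch with a nut")
print(keywords)  # squirrel sitting branch nut
```

The result can be uploaded as a short string field in place of the full caption, which keeps keyword filtering possible without exceeding the length limit.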
What's the difference in backend performance? Not wasting time. I'm a bit wary about introducing blobs, though, because then you can actually build a "versioning solution on top of Data Engine": e.g. have a .json file where, for some reason, the actual object being labeled lives in the metadata on that file. That sounds like a huge misuse and will probably tank performance for real.
That's legit. I think it would require us to have a dedicated kind of indexing over text blobs for that use case. Using a "Contains" query to find datapoints in big datasets will probably be too slow.
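To illustrate the difference being discussed, here is a minimal sketch contrasting a "Contains"-style scan with a simple inverted index over caption words. It says nothing about how Data Engine is implemented; all names (`contains_scan`, `build_index`, the sample captions) are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into a set of lowercased alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def contains_scan(captions, word):
    """Substring search: touches every caption on every query, O(total text size)."""
    needle = word.lower()
    return {dp_id for dp_id, caption in captions.items() if needle in caption.lower()}


def build_index(captions):
    """Inverted index: token -> set of datapoint ids. Built once, up front."""
    index = defaultdict(set)
    for dp_id, caption in captions.items():
        for token in tokenize(caption):
            index[token].add(dp_id)
    return index


# Hypothetical sample data.
captions = {
    "img1.jpg": "A squirrel on a branch",
    "img2.jpg": "Two dogs playing",
    "img3.jpg": "Squirrel eating a nut",
}

index = build_index(captions)

# After indexing, a whole-word lookup is a single dictionary access.
indexed_hits = index.get("squirrel", set())
scanned_hits = contains_scan(captions, "squirrel")
print(sorted(indexed_hits))  # ['img1.jpg', 'img3.jpg']
```

The two approaches are not equivalent: the scan matches substrings (it would also find "squirrels"), while this index only matches whole tokens. That trade-off is part of what a dedicated text index would have to decide.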
Yes, mostly. Why would it tank performance?
I tried to upload image captions as a metadata point. The idea being I could then filter the dataset based on the contents of the captions. I ran into an error:
This was the code: