We could split a sequence of signs into the first sign and the second.
There needs to be an inference function that can "animate" the first sign, starting from a neutral position, and then use the last frame of the first sign to generate the second sign (ideally it would use the entire first sign, but that much context is expensive in memory).
This generation process should have an option that prevents the iterative process from editing the first frame, so that the appearance of the person does not change over a few iterations.
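A minimal sketch of the chaining idea above, not the actual text-to-pose implementation. All names here (`generate_sign`, `generate_sequence`, `refine_step`) are hypothetical, and `refine_step` stands in for one iteration of whatever model does the refinement. A frame is a flat list of coordinates to keep the example dependency-free.

```python
from typing import Callable, List

Frame = List[float]   # one pose frame, flattened keypoint coordinates
Pose = List[Frame]    # a sequence of frames

# Hypothetical signature: (current pose, sign) -> refined pose (a new list).
RefineStep = Callable[[Pose, str], Pose]


def generate_sign(first_frame: Frame, sign: str, length: int,
                  refine_step: RefineStep, num_steps: int = 5) -> Pose:
    """Iteratively refine a pose sequence while keeping frame 0 fixed."""
    # Start by repeating the conditioning frame for the whole sequence.
    pose = [list(first_frame) for _ in range(length)]
    for _ in range(num_steps):
        pose = refine_step(pose, sign)
        # Re-impose the conditioning frame so refinement cannot alter it.
        pose[0] = list(first_frame)
    return pose


def generate_sequence(neutral_frame: Frame, signs: List[str],
                      lengths: List[int], refine_step: RefineStep,
                      num_steps: int = 5) -> Pose:
    """Generate signs one after another, seeding each with the previous last frame."""
    frames: Pose = []
    start = neutral_frame
    for sign, length in zip(signs, lengths):
        generated = generate_sign(start, sign, length, refine_step, num_steps)
        # For later signs, drop the seed frame: it repeats the previous last frame.
        frames.extend(generated if not frames else generated[1:])
        start = generated[-1]
    return frames
```

Resetting frame 0 after every step is the simplest way to "freeze" it. A model could instead mask that frame out of its update, but that depends on the architecture.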
Notes:
Training on a dictionary-like scenario is a bit weird, because we try to predict "go from neutral position, to sign, to neutral position", so the skeleton would always return to the neutral position between signs. This could either be "cropped" with a heuristic we already have, or not be a problem when using data like the DGS Corpus. Another option would be to condition not on the last frame alone but on the last N frames, reversed, so that the video starts by moving the hands up to the previous position.
The length of each sign should be predicted independently as a mean (mu) and standard deviation (std); then, if a sentence length was provided, find the best per-sign lengths to match it (extremely important for subtitling) - text-to-pose: Predict length as distribution #1
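One possible way to do the length matching in the last note, as a sketch only. It assumes each sign's length is an independent Gaussian with the predicted mu and std, and picks the lengths that are most likely under that assumption while summing to the target. The issue does not specify this objective, and `fit_lengths` is a hypothetical name. Under the assumption, the difference between target and total mean is distributed in proportion to each sign's variance.

```python
from typing import List, Optional


def fit_lengths(mus: List[float], stds: List[float],
                total: Optional[float] = None) -> List[float]:
    """Pick per-sign lengths (in frames) that sum to `total`, if one is given.

    Minimises sum(((l_i - mu_i) / std_i) ** 2) subject to sum(l_i) == total.
    """
    if total is None:
        # No sentence length provided: use each sign's expected length.
        return [float(m) for m in mus]

    slack = total - sum(mus)
    variance_sum = sum(s * s for s in stds)
    if variance_sum == 0:
        # No sign is allowed to stretch more than another: spread evenly.
        return [m + slack / len(mus) for m in mus]

    # Signs with a larger variance absorb more of the slack.
    return [m + (s * s) * slack / variance_sum for m, s in zip(mus, stds)]
```

For example, `fit_lengths([10, 20], [1, 2], total=35)` stretches the second sign four times as much as the first. The result is not rounded to whole frames, and a very short target can produce non-positive lengths, so a real implementation would need to round and clamp.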
Input: a sequence of signs written in FSW (SignWriting).