text-to-pose: Compose signs #4

Open
AmitMY opened this issue May 16, 2022 · 0 comments
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

AmitMY commented May 16, 2022

Given a sequence of signs (written here in FSW, Formal SignWriting):

AS14c20S27106M518x529S14c20481x471S27106503x489 AS18701S1870aS2e734S20500M518x533S1870a489x515S18701482x490S20500508x496S2e734500x468 S38800464x496

image
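As a rough illustration of how the FSW string above could be broken into individual signs: the sketch below assumes tokens are whitespace-separated, that a sign token contains a box marker (`B`, `L`, `M` or `R` followed by a coordinate), and that punctuation symbols fall in the `S387` to `S38b` range. The function name `split_signs` is hypothetical and not part of this repository.

```python
import re

# A sign token carries a box marker such as "M518x529".
SIGN_BOX = re.compile(r"[BLMR]\d{3}x\d{3}")
# Punctuation tokens are a single symbol in the S387..S38b range plus a coordinate.
PUNCTUATION = re.compile(r"S38[7-9ab][0-5][0-9a-f]\d{3}x\d{3}")


def split_signs(fsw: str) -> list:
    """Return the sign tokens of an FSW string, dropping punctuation tokens."""
    return [
        token
        for token in fsw.split()
        if SIGN_BOX.search(token) and not PUNCTUATION.fullmatch(token)
    ]


sequence = (
    "AS14c20S27106M518x529S14c20481x471S27106503x489 "
    "AS18701S1870aS2e734S20500M518x533S1870a489x515S18701482x490"
    "S20500508x496S2e734500x468 "
    "S38800464x496"
)
signs = split_signs(sequence)
first_sign, second_sign = signs
```

Here the trailing `S38800464x496` token is treated as punctuation and dropped, leaving the two signs to be generated.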

We could split the sequence into the first sign and the second.
There needs to be an inference function that can "animate" the first sign starting from a neutral position, then use the last frame of the first sign as the starting point for generating the second sign (ideally it would use the entire first sign, but that much context is expensive in memory).
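A minimal sketch of that chaining, assuming the model is wrapped in a callable `generate(sign, first_frame)` that returns the new frames for one sign. Both `generate` and `compose_signs` are hypothetical names, and a frame is represented here as a flat list of floats only for illustration.

```python
from typing import Callable, List, Sequence

Frame = List[float]  # hypothetical: flattened keypoints of one pose frame


def compose_signs(
    signs: Sequence[str],
    neutral: Frame,
    generate: Callable[[str, Frame], List[Frame]],
) -> List[Frame]:
    """Generate signs one after another, each starting from the previous last frame."""
    frames: List[Frame] = [neutral]
    for sign in signs:
        # Only the last frame is passed as context, to keep memory use low.
        frames.extend(generate(sign, frames[-1]))
    return frames


# Stand-in generator for demonstration: two frames that drift away from the start.
def dummy_generate(sign: str, first_frame: Frame) -> List[Frame]:
    return [[value + 1.0 for value in first_frame],
            [value + 2.0 for value in first_frame]]


video = compose_signs(["sign-1", "sign-2"], [0.0], dummy_generate)
```

Passing the whole previous sign instead would only require changing what is handed to `generate`, at the cost of a longer context.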

The generation process should be able to prevent the iterative refinement from editing the first frame, so that the person's appearance does not change over a few iterations.
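One simple way to get this behaviour, sketched under the assumption that refinement is a loop over a `step` function that maps a full frame sequence to an updated one: overwrite the first frame with the original after every step. `step` and `refine_with_fixed_first_frame` are hypothetical names, not the model's actual interface.

```python
from typing import Callable, List

Frame = List[float]  # hypothetical: flattened keypoints of one pose frame


def refine_with_fixed_first_frame(
    first_frame: Frame,
    initial_frames: List[Frame],
    step: Callable[[List[Frame]], List[Frame]],
    num_steps: int,
) -> List[Frame]:
    """Run `num_steps` refinement steps while pinning frame 0 to `first_frame`."""
    frames = [list(first_frame)] + [list(frame) for frame in initial_frames]
    for _ in range(num_steps):
        frames = list(step(frames))
        # Discard whatever the step did to the first frame.
        frames[0] = list(first_frame)
    return frames


# Stand-in refinement step for demonstration: shifts every value by 1.
def dummy_step(frames: List[Frame]) -> List[Frame]:
    return [[value + 1.0 for value in frame] for frame in frames]


refined = refine_with_fixed_first_frame([0.0], [[10.0]], dummy_step, num_steps=3)
```

The refinement still sees the pinned frame as input at every step, so later frames can stay consistent with it.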


Notes:

  1. Training in a dictionary-like scenario is a bit awkward here: each example teaches the model to go from a neutral position, to the sign, and back to a neutral position, so the skeleton would always return to neutral between signs. The return could either be "cropped" with a heuristic we already have, or not be a problem at all when training on data like the DGS Corpus. Another option is to condition not on the last frame but on the last N frames, reversed, so that the video starts by moving the hands up to the previous position.
  2. The length of each sign should be predicted independently as a mean (mu) and standard deviation (std); then, if a sentence length is provided, find the best per-sign lengths to match it (extremely important for subtitling). See "text-to-pose: Predict length as distribution #1".
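For note 2, one possible reading of "best length" is the most likely set of lengths under independent Gaussians, constrained to sum to the sentence length. Minimising the sum of `((l_i - mu_i) / std_i) ** 2` subject to that constraint gives `l_i = mu_i + std_i ** 2 * (T - sum(mu)) / sum(std ** 2)`, so signs with a larger std absorb more of the stretch. The sketch below is an assumption about how this could work, not the method from issue #1; `fit_lengths` is a hypothetical name, and it does not guard against non-positive lengths when the target is far shorter than the sum of the means.

```python
import math
from typing import List, Optional, Sequence


def fit_lengths(
    mus: Sequence[float],
    stds: Sequence[float],
    total: Optional[int] = None,
) -> List[int]:
    """Pick per-sign frame counts; if `total` is given, make them sum to it."""
    if total is None:
        # No sentence length provided: use each sign's mean length.
        return [max(1, round(mu)) for mu in mus]

    variances = [std * std for std in stds]
    variance_sum = sum(variances)
    slack = total - sum(mus)
    if variance_sum == 0:
        # No uncertainty information: spread the slack evenly.
        exact = [mu + slack / len(mus) for mu in mus]
    else:
        exact = [mu + var * slack / variance_sum for mu, var in zip(mus, variances)]

    # Largest-remainder rounding so the integer lengths still sum to `total`.
    lengths = [math.floor(value) for value in exact]
    missing = total - sum(lengths)
    by_fraction = sorted(
        range(len(exact)), key=lambda i: exact[i] - lengths[i], reverse=True
    )
    for i in by_fraction[:missing]:
        lengths[i] += 1
    return lengths


stretched = fit_lengths([20, 30], [2, 4], total=60)
```

In this example the second sign has four times the variance of the first, so it takes four fifths of the ten extra frames.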
@AmitMY AmitMY added enhancement New feature or request help wanted Extra attention is needed labels May 16, 2022