Last pre-release checks
lukassnoek committed Jan 10, 2023
1 parent 9edc6a7 commit 7249e6d
Showing 15 changed files with 184 additions and 174 deletions.
38 changes: 29 additions & 9 deletions docs/getting_started/installation.md
@@ -1,9 +1,9 @@
# Medusa installation

- Medusa is a Python package which works with Python versions 3.9 and above. We recommend
- using Python version 3.9. Moreover, we strongly recommend to install the `medusa` package
+ Medusa is a Python package which works with Python version 3.9 and on Linux, Windows,
+ and Mac (except Mac M1/M2). Moreover, we strongly recommend installing the `medusa` package
in a separate environment, using for example [conda](https://anaconda.org/anaconda/conda).
- If you'd use *conda*, you can create a new environment named "medusa" with python 3.9
+ If you use *conda*, you can create a new environment named "medusa" with Python 3.9
as follows:

```console
@@ -18,8 +18,13 @@ conda activate medusa

The next step is to install Medusa. Medusa actually offers two versions of the package:
`medusa` and `medusa-gpu`, where the latter can be used instead of the former if you
- have access to an NVIDIA GPU (and CUDA version 11.6). When you're not sure whether
- you have access to an appropriate GPU, install the regular `medusa` package.
+ have access to an NVIDIA GPU (and CUDA version 11.6). Actually, `medusa-gpu` can also
+ be installed and used on systems without a GPU, but the installation is noticeably
+ larger (~2GB, instead of 300MB for the CPU version). When you're not sure whether
+ you have access to an appropriate GPU, we recommend installing the regular `medusa` package.
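If in doubt, one generic way to check for a usable NVIDIA GPU (standard NVIDIA tooling, not part of Medusa) is to run `nvidia-smi` in a terminal:

```console
nvidia-smi  # lists any NVIDIA GPU(s) plus the installed driver/CUDA version
```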

To install Medusa, run one of the commands listed below in your terminal (with the right
environment activated):

`````{tab-set}
@@ -37,18 +42,33 @@ pip install https://github.com/medusa-4D/medusa/releases/download/v0.0.3/medusa_
`````

- At this point, `medusa` can be used, but only the Mediapipe reconstruction model can be
- used. To be able to use the FLAME-based reconstruction models such as DECA, EMOCA, and
+ ```{note}
+ While installing Python packages/wheels from locations other than PyPI is generally
+ discouraged, Medusa actually hosts its builds in its own GitHub repository (as you can
+ see in the install commands above). The reason for doing so (instead of on PyPI) is that
+ Medusa depends on a specific version of [PyTorch](https://pytorch.org/), which itself
+ is not available on PyPI (only as a wheel). Listing non-PyPI dependencies in packages
+ is not permitted by PyPI, which is why Medusa wheels are hosted on GitHub.
+ If you want to build Medusa yourself, you can clone the repository and run the
+ `build_wheels` script, which will create a directory `dist` with two wheel files
+ (one for `medusa` and one for `medusa-gpu`).
+ ```
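For those who want to build the wheels themselves, the steps would look roughly like this (a sketch only: the repository URL is taken from the release links above, and the exact way to invoke the `build_wheels` script may differ):

```console
git clone https://github.com/medusa-4D/medusa.git
cd medusa
bash build_wheels  # writes the medusa and medusa-gpu wheels to ./dist
```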

+ At this point, `medusa` can be used, but only the Mediapipe reconstruction model will be
+ available. To be able to use the FLAME-based reconstruction models such as DECA, EMOCA, and
Spectre, you need to download some additional data. Importantly, before you do, you need
to [register](https://flame.is.tue.mpg.de/register.php) on the [FLAME website](https://flame.is.tue.mpg.de/index.html)
and accept their [license terms](https://flame.is.tue.mpg.de/modellicense.html).

After creating an account, you can download all external data with the
`medusa_download_ext_data` command. To download all data to a new directory
- (medusa_ext_data), you'd run:
+ (default location: `~/.medusa_ext_data`), you'd run:

```console
medusa_download_ext_data --directory medusa_ext_data --username your_flame_username --password your_flame_passwd
```

- After all data has been downloaded (~1.8GB), all Medusa functionality should be available!
+ where `your_flame_username` and `your_flame_passwd` are the username and password associated
+ with the account you created on the FLAME website. After all data has been downloaded
+ (~1.8GB), all Medusa functionality should be available!
29 changes: 14 additions & 15 deletions docs/index.md
@@ -6,25 +6,24 @@
![coverage](https://img.shields.io/endpoint?url=https://gist.githubusercontent.com/lukassnoek/cb6da52c965ec24f136b74a1ebad1964/raw/medusa_interrogate_badge.json)
![Python](https://img.shields.io/badge/python-3.9-blue.svg)

- Medusa is a Python toolbox to perform 4D face reconstruction and analysis. You can use it
- to reconstruct a series of 3D meshes of (moving) faces from video files: one 3D mesh for
- each frame of the video (resulting in a "4D" representation of facial movement). In
- addition to functionality to reconstruct faces, Medusa also contains functionality to
- preprocess and visualize the resulting 4D reconstructions.
+ Medusa is a Python toolbox for face image and video analysis. It offers tools for face
+ detection, alignment, rendering, and most importantly, *4D reconstruction*.
+ Using state-of-the-art 3D reconstruction models, Medusa can track and reconstruct faces
+ in videos (one 3D mesh per face, per frame) and thus provide a way to automatically
+ measure and quantify face movement as 4D signals.

- More specifically, Medusa allows you to reconstruct, preprocess, and analyze
- frame-by-frame time series of 3D faces from videos. The data that Medusa outputs is
- basically a set of 3D points ("vertices"), which together represent face shape.
- Medusa then processes these points in a similar way that fMRI or EEG/MEG software
- processes voxels or sensors, but instead of representing "brain activity", it represents
- face movement! Medusa makes relatively few assumptions as to how you want
- to (further) analyze the face and just returns the raw set of vertices. For some ideas on
+ In Medusa, 4D reconstruction data is represented as a series of 3D meshes. Each mesh
+ describes the face shape at a particular frame in the video, and the changes in the
+ meshes over time thus describe facial *movement* (including expression) quantitatively
+ and dynamically. Medusa makes relatively few assumptions as to how you want to (further)
+ analyze the face and just returns the raw set of vertices. For some ideas on
how to analyze such data, check out the [analysis tutorials](tutorials/analysis) (WIP).
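To make the idea concrete, here is a purely illustrative sketch of what such data could look like (the array shapes and names are assumptions for illustration, not Medusa's actual output format):

```python
import numpy as np

# One 3D mesh per frame: an (n_frames, n_vertices, 3) array of vertex coordinates
n_frames, n_vertices = 100, 468  # e.g. ~100 video frames, a Mediapipe-sized mesh
verts = np.random.rand(n_frames, n_vertices, 3)  # stand-in for reconstructed vertices

# Frame-to-frame displacement of every vertex gives a simple per-vertex movement signal
disp = np.linalg.norm(np.diff(verts, axis=0), axis=-1)  # shape: (n_frames - 1, n_vertices)
movement_per_frame = disp.mean(axis=1)  # average movement across the face, per frame transition
```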

## Documentation overview

- On this website, you can find general information about Medusa (such as how to [install](getting_started/installation)
- and [cite](getting_started/citation) it), as well as several tutorials
- and details on Medusa's [command-line interface](api/cli) and [Python interface](api/python).
+ On this website, you can find general information about Medusa (such as how to
+ [install](getting_started/installation) and [cite](getting_started/citation) it), as
+ well as several tutorials and details on Medusa's [command-line interface](api/cli) and
+ [Python interface](api/python).

A great way to get more familiar with the package is to check out the [quickstart](getting_started/quickstart)!
16 changes: 8 additions & 8 deletions medusa/containers/results.py
@@ -308,23 +308,23 @@ def visualize(
# BELOW: OLD CODE TO CREATE BOUNDING BOX FROM CROPPED IMAGES
# bbox_crop = torch.tensor([[0, 0], [0, h-1], [h-1, w-1], [0, w-1]], dtype=torch.float32, device=self.device)
# bbox_crop = bbox_crop.repeat(b, 1, 1)
- # crop_mats = torch.inverse(self.crop_mats[idx])
- # bbox = transform_points(crop_mats, bbox_crop)
+ # crop_mat = torch.inverse(self.crop_mat[idx])
+ # bbox = transform_points(crop_mat, bbox_crop)

# Check for landmarks (`lms`), which we'll draw if available
if hasattr(self, "lms"):
lms = self.lms[det_idx]

if show_cropped:
# Need to crop the original images!
- crop_mats = self.crop_mats[det_idx]
+ crop_mat = self.crop_mat[det_idx]
img = warp_affine(
- img.unsqueeze(0).float(), crop_mats[:, :2, :], crop_size
+ img.unsqueeze(0).float(), crop_mat[:, :2, :], crop_size
)
img = img.to(torch.uint8).squeeze(0)

# And warp the landmarks to the cropped image space
- lms = transform_points(crop_mats, lms)
+ lms = transform_points(crop_mat, lms)

# TODO: scale radius
img = draw_keypoints(img, lms, colors=(0, 255, 0), radius=2)
@@ -334,11 +334,11 @@
if show_cropped:
template_ = template.unsqueeze(0)
else:
- crop_mats = torch.inverse(self.crop_mats[det_idx])
+ crop_mat = torch.inverse(self.crop_mat[det_idx])
template_ = template.repeat(lms.shape[0], 1, 1).to(
- crop_mats.device
+ crop_mat.device
)
- template_ = transform_points(crop_mats, template_)
+ template_ = transform_points(crop_mat, template_)

img = draw_keypoints(img, template_, colors=(0, 0, 255), radius=1.5)
img = img.to(self.device)
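The `crop_mat` tensors used above are 3x3 similarity transforms that map original-image coordinates into cropped-image space; inverting them maps points back. A minimal stand-alone sketch of that idea (plain PyTorch, not Medusa's `transform_points` helper):

```python
import torch

# A single 3x3 similarity transform (scale 0.5 plus translation), batched as (B, 3, 3)
crop_mat = torch.tensor([[[0.5, 0.0, -20.0],
                          [0.0, 0.5, -30.0],
                          [0.0, 0.0,   1.0]]])
lms = torch.rand(1, 68, 2) * 224  # (B, N, 2) landmarks in original-image coordinates

ones = torch.ones(1, 68, 1)
lms_crop = (torch.cat([lms, ones], dim=-1) @ crop_mat.transpose(1, 2))[..., :2]  # to crop space
lms_back = (torch.cat([lms_crop, ones], dim=-1)
            @ torch.inverse(crop_mat).transpose(1, 2))[..., :2]  # back to original image space
assert torch.allclose(lms, lms_back, atol=1e-4)
```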
9 changes: 5 additions & 4 deletions medusa/crop/__init__.py
@@ -1,7 +1,8 @@
"""Top-level module with two main crop models:
- * ``LandmarkAlignCropModel``
- * ``LandmarkBboxCropModel``
+ * ``AlignCropModel``
+ * ``BboxCropModel``
"""

- from .align_crop import LandmarkAlignCropModel
- from .bbox_crop import LandmarkBboxCropModel
+ from .align_crop import AlignCropModel
+ from .bbox_crop import BboxCropModel
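Downstream code that imported the old class names needs the corresponding one-line update, e.g.:

```python
# Before this commit:
# from medusa.crop import LandmarkAlignCropModel, LandmarkBboxCropModel
# After:
from medusa.crop import AlignCropModel, BboxCropModel
```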
14 changes: 7 additions & 7 deletions medusa/crop/align_crop.py
@@ -29,7 +29,7 @@
The coordinates are relative to an image of size 112 x 112."""


- class LandmarkAlignCropModel(BaseCropModel):
+ class AlignCropModel(BaseCropModel):
"""Cropping model based on functionality from the ``insightface`` package,
as used by MICA (https://github.com/Zielon/MICA).
@@ -52,7 +52,7 @@ class LandmarkAlignCropModel(BaseCropModel):
To crop an image to be used for MICA reconstruction:
>>> from medusa.data import get_example_frame
- >>> crop_model = LandmarkAlignCropModel()
+ >>> crop_model = AlignCropModel()
>>> img = get_example_frame() # path to jpg image
>>> out = crop_model(img)
"""
@@ -90,7 +90,7 @@ def __call__(self, imgs):
-------
out_crop : dict
Dictionary with cropping outputs; includes the keys "imgs_crop" (cropped
images) and "crop_mats" (3x3 crop matrices)
images) and "crop_mat" (3x3 crop matrices)
"""
# Load images here instead of in detector to avoid loading them twice
imgs = load_inputs(
@@ -100,17 +100,17 @@
out_det = self._det_model(imgs)

if out_det.get("conf", None) is None:
return {"imgs_crop": None, "crop_mats": None, **out_det}
return {"imgs_crop": None, "crop_mat": None, **out_det}

# Estimate transform landmarks -> template landmarks
- crop_mats = estimate_similarity_transform(
+ crop_mat = estimate_similarity_transform(
out_det["lms"], self.template, estimate_scale=True
)
imgs_stacked = imgs[out_det["img_idx"]]
imgs_crop = warp_affine(
- imgs_stacked, crop_mats[:, :2, :], dsize=self.output_size
+ imgs_stacked, crop_mat[:, :2, :], dsize=self.output_size
)

out_crop = {"imgs_crop": imgs_crop, "crop_mats": crop_mats, **out_det}
out_crop = {"imgs_crop": imgs_crop, "crop_mat": crop_mat, **out_det}

return out_crop
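Putting the docstring example and the documented output keys together, typical usage would look roughly like this (a sketch; the example-data helper and the output keys are taken from the docstrings above, the rest is assumed):

```python
from medusa.crop import AlignCropModel
from medusa.data import get_example_frame

crop_model = AlignCropModel()
img = get_example_frame()   # path to an example jpg image
out = crop_model(img)       # dict with "imgs_crop", "crop_mat", plus the detector outputs
imgs_crop, crop_mat = out["imgs_crop"], out["crop_mat"]
```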
12 changes: 6 additions & 6 deletions medusa/crop/bbox_crop.py
@@ -18,7 +18,7 @@
from .base import BaseCropModel


- class LandmarkBboxCropModel(BaseCropModel):
+ class BboxCropModel(BaseCropModel):
"""A model that crops an image by creating a bounding box based on a set of
face landmarks.
@@ -107,7 +107,7 @@ def __call__(self, imgs):
-------
out_crop : dict
Dictionary with cropping outputs; includes the keys "imgs_crop" (cropped
images) and "crop_mats" (3x3 crop matrices)
images) and "crop_mat" (3x3 crop matrices)
"""
# Load images here instead of in detector to avoid loading them twice

@@ -118,7 +118,7 @@
out_det = self._detector(imgs)

if out_det.get("conf", None) is None:
- return {**out_det, "imgs_crop": None, "crop_mats": None}
+ return {**out_det, "imgs_crop": None, "crop_mat": None}

n_det = out_det["lms"].shape[0]
bbox = out_det["bbox"]
@@ -173,17 +173,17 @@
device=self.device,
)
dst = dst.repeat(n_det, 1, 1)
- crop_mats = estimate_similarity_transform(
+ crop_mat = estimate_similarity_transform(
bbox[:, :3, :], dst, estimate_scale=True
)

# Finally, warp the original images (uncropped) images to the final
# cropped space
- imgs_crop = warp_affine(imgs_stack, crop_mats[:, :2, :], dsize=(h_out, w_out))
+ imgs_crop = warp_affine(imgs_stack, crop_mat[:, :2, :], dsize=(h_out, w_out))
out_crop = {
**out_det,
"imgs_crop": imgs_crop,
"crop_mats": crop_mats,
"crop_mat": crop_mat,
"lms": lms,
}
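Note that both crop models pass only the first two rows of the 3x3 matrix to `warp_affine`, which (in kornia-style APIs, at least) expects a 2x3 affine matrix; the dropped third row of a similarity transform is always `[0, 0, 1]`, so no information is lost. As a tiny sketch:

```python
import torch

crop_mat = torch.eye(3).unsqueeze(0)  # (B, 3, 3) similarity transform (identity as a stand-in)
affine = crop_mat[:, :2, :]           # (B, 2, 3) slice actually handed to warp_affine
print(affine.shape)                   # torch.Size([1, 2, 3])
```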

