Added dai_type, updated encoding flags, and removed reverse_input_channels flag #48

Merged on Nov 11, 2024 (27 commits)

Commits
fe026a7
fix type in DevOption description
ptoupas Nov 7, 2024
3dc04dc
add int4 into DataType
ptoupas Nov 7, 2024
ac6d322
fix logic and updated flags on default.yaml and NN Archive cases
ptoupas Nov 7, 2024
6ecf46c
fix edge case where only 'encoding.from' or 'encoding.to' is given wi…
ptoupas Nov 7, 2024
92e19e4
fix test_config tests and added two more regarding the encoding (from…
ptoupas Nov 7, 2024
c9926f9
fix extract_preprocessing method and set default values to mean=0 and…
ptoupas Nov 7, 2024
0a5dd13
create back the dai_type based on both the inp.data_type and inp.layout
ptoupas Nov 7, 2024
f7f9b0e
simplify a few conditions
ptoupas Nov 7, 2024
adcd98c
remove reverse_input_channels from rvc2 exporter and onnx_tools and f…
ptoupas Nov 8, 2024
15a3ce0
remove reverse_input_channels from hailo exporter
ptoupas Nov 8, 2024
93c3e94
update defaults.yaml file
ptoupas Nov 8, 2024
4033408
skip addition of mean and scale nodes to onnx model if mean=0 and sca…
ptoupas Nov 8, 2024
c8f2c87
add to the preprocessing block in modelconverter_config_to_nn and arc…
ptoupas Nov 8, 2024
7b729e1
revert a change done on _parse_values for mean and scale in case on v…
ptoupas Nov 8, 2024
a962291
fix edge case where inputs mean_values or scale_values is set to (sin…
ptoupas Nov 8, 2024
046446a
add note on README about CLI arguments when default stages names are …
ptoupas Nov 8, 2024
dbc009f
updated the Running ModelConverter section in the README file
ptoupas Nov 8, 2024
4d99f9d
added a section in README regarding the Encoding Configuration Flags
ptoupas Nov 8, 2024
230b1ff
fix logic on the creation of final NN Archive regarding the reverse_c…
ptoupas Nov 8, 2024
cbfdd23
Update README.md
ptoupas Nov 9, 2024
9a08c90
add encoding_mismatch property on InputConfig and added a unit test f…
ptoupas Nov 11, 2024
4e22070
remove check for concat node after split, since that is always the case
ptoupas Nov 11, 2024
f9bc726
add extra tests for output_nn_config_from_yaml and output_nn_config_f…
ptoupas Nov 11, 2024
0a8698c
add permutation in mean and scale values when encoding_mismatch is Tr…
ptoupas Nov 11, 2024
50f4968
Update README.md
ptoupas Nov 11, 2024
f1ecf6d
Update requirements.txt
ptoupas Nov 11, 2024
2a58c0a
Update .pre-commit-config.yaml
ptoupas Nov 11, 2024
71 changes: 67 additions & 4 deletions README.md
Expand Up @@ -21,13 +21,18 @@ Convert your **ONNX** models to a format compatible with any generation of Luxon

## Table of Contents

- [MLOps - Compilation Library](#mlops---compilation-library)
- [ModelConverter - Compilation Library](#modelconverter---compilation-library)
- [Status](#status)
- [Table of Contents](#table-of-contents)
- [Installation](#installation)
- [System Requirements](#system-requirements)
- [Before You Begin](#before-you-begin)
- [Instructions](#instructions)
- [GPU Support](#gpu-support)
- [Running ModelConverter](#running-modelconverter)
- [Encoding Configuration Flags](#encoding-configuration-flags)
- [YAML Configuration File](#yaml-configuration-file)
- [NN Archive Configuration File](#nn-archive-configuration-file)
- [Sharing Files](#sharing-files)
- [Usage](#usage)
- [Examples](#examples)
Expand Down Expand Up @@ -101,9 +106,58 @@ To enable GPU acceleration for `hailo` conversion, install the [Nvidia Container

## Running ModelConverter

Configuration for the conversion predominantly relies on a `yaml` config file. For reference, see [defaults.yaml](shared_with_container/configs/defaults.yaml) and other examples located in the [shared_with_container/configs](shared_with_container/configs) directory.
There are two main ways to configure the conversion process:

However, you have the flexibility to modify specific settings without altering the config file itself. This is done using command line arguments. You provide the arguments in the form of `key value` pairs. For better understanding, see [Examples](#examples).
1. **YAML Config File (Primary Method)**:
The primary way to configure the conversion is through a YAML configuration file. For reference, you can check [defaults.yaml](shared_with_container/configs/defaults.yaml) and other examples located in the [shared_with_container/configs](shared_with_container/configs) directory.
1. **NN Archive**:
Alternatively, you can use an [NN Archive](https://rvc4.docs.luxonis.com/software/ai-inference/nn-archive/#NN%20Archive) as input. An NN Archive includes a model in one of the supported formats—ONNX (.onnx), OpenVINO IR (.xml and .bin), or TensorFlow Lite (.tflite)—alongside a `config.json` file. The config.json file follows a specific configuration format as described in the [NN Archive Configuration Guide](https://rvc4.docs.luxonis.com/software/ai-inference/nn-archive/#NN%20Archive-Configuration).

**Modifying Settings with Command-Line Arguments**:
In addition to these two configuration methods, you have the flexibility to override specific settings directly via command-line arguments. By supplying `key-value` pairs in the CLI, you can adjust particular settings without explicitly altering the config files (YAML or NN Archive). For further details, refer to the [Examples](#examples) section.

### Encoding Configuration Flags

In the conversion process, you have options to control the color encoding format in both the YAML configuration file and the NN Archive configuration. Here’s a breakdown of each available flag:

#### YAML Configuration File

The `encoding` flag in the YAML configuration file allows you to specify color encoding as follows:

- **Single-Value `encoding`**:
Setting encoding to a single value, such as *"RGB"*, *"BGR"*, *"GRAY"*, or *"NONE"*, will automatically apply this setting to both `encoding.from` and `encoding.to`. For example, `encoding: RGB` sets both `encoding.from` and `encoding.to` to *"RGB"* internally.
- **Multi-Value `encoding.from` and `encoding.to`**:
Alternatively, you can explicitly set `encoding.from` and `encoding.to` to different values. For example:
```yaml
encoding:
from: RGB
to: BGR
```
This configuration specifies that the input data is in RGB format and will be converted to BGR format during processing.

> [!NOTE]
> If the encoding is not specified in the YAML configuration, the default values are set to `encoding.from=RGB` and `encoding.to=BGR`.

> [!NOTE]
> Certain options can be set **globally**, applying to all inputs of the model, or **per input**. If specified per input, these settings will override the global configuration for that input alone. The options that support this flexibility include `scale_values`, `mean_values`, `encoding`, `data_type`, `shape`, and `layout`.
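
The single-value and multi-value forms described above can be sketched as a small normalization step. This is only an illustration, not the converter's actual implementation: the function name `normalize_encoding` and the plain-dict representation are assumptions, and the sketch requires both `from` and `to` when the multi-value form is used.

```python
from typing import Dict, Optional, Union

# Values named in the README text above.
VALID_ENCODINGS = {"RGB", "BGR", "GRAY", "NONE"}


def normalize_encoding(
    encoding: Optional[Union[str, Dict[str, str]]],
) -> Dict[str, str]:
    """Expand an `encoding` setting into explicit from/to values.

    Hypothetical helper for illustration only.
    """
    if encoding is None:
        # Documented default when no encoding is specified.
        result = {"from": "RGB", "to": "BGR"}
    elif isinstance(encoding, str):
        # Single-value form: the same value applies to both sides.
        result = {"from": encoding, "to": encoding}
    else:
        # Multi-value form: both keys are required in this sketch.
        result = {"from": encoding["from"], "to": encoding["to"]}

    for value in result.values():
        if value not in VALID_ENCODINGS:
            raise ValueError(f"Unsupported encoding: {value!r}")
    return result
```

For example, `normalize_encoding("RGB")` yields the same `from` and `to` value, while passing `None` falls back to the documented defaults.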

#### NN Archive Configuration File

In the NN Archive configuration, there are two flags related to color encoding control:

- **`dai_type`**:
Provides more comprehensive control over the input type compatible with the DAI backend. It is read by DepthAI to automatically configure the processing pipeline, including any necessary modifications to the input image format.
- **`reverse_channels` (Deprecated)**:
Determines the input color format of the model: when set to *True*, the input is considered to be *"RGB"*, and when set to *False*, it is treated as *"BGR"*. This flag is deprecated and will be replaced by the `dai_type` flag in future versions.

> [!NOTE]
> If neither `dai_type` nor `reverse_channels` is provided, the input to the model is considered to be *"RGB"*.

> [!NOTE]
> If both `dai_type` and `reverse_channels` are provided, the converter will give priority to `dai_type`.

> [!IMPORTANT]
> Provide mean/scale values in the original color format used during model training (e.g., RGB or BGR). Any necessary channel permutation is handled internally—do not reorder values manually.
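
The precedence rules in the notes above can be sketched roughly as follows. The function name `resolve_input_encoding` is hypothetical, and the sketch assumes that a `dai_type` string starts with the color order (for example `RGB888p` or `BGR888i`); the real converter may parse it differently.

```python
from typing import Optional


def resolve_input_encoding(
    dai_type: Optional[str] = None,
    reverse_channels: Optional[bool] = None,
) -> str:
    """Pick the model input color format from NN Archive flags.

    Hypothetical helper for illustration only.
    """
    if dai_type is not None:
        # `dai_type` takes priority over `reverse_channels`.
        # Assumption: the string is prefixed with the color order.
        for prefix in ("RGB", "BGR", "GRAY"):
            if dai_type.startswith(prefix):
                return prefix
        return "NONE"
    if reverse_channels is not None:
        # Deprecated flag: True means RGB, False means BGR.
        return "RGB" if reverse_channels else "BGR"
    # Neither flag provided: the input is considered RGB.
    return "RGB"
```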

### Sharing Files

Expand Down Expand Up @@ -158,6 +212,7 @@ You can run the built image either manually using the `docker run` command or us
1. Execute the conversion:

- If using the `docker run` command:

```bash
docker run --rm -it \
-v $(pwd)/shared_with_container:/app/shared_with_container/ \
Expand All @@ -168,11 +223,15 @@ You can run the built image either manually using the `docker run` command or us
convert <target> \
--path <s3_url_or_path> [ config overrides ]
```

- If using the `modelconverter` CLI:

```bash
modelconverter convert <target> --path <s3_url_or_path> [ config overrides ]
```

- If using `docker-compose`:

```bash
docker compose run <target> convert <target> ...
```
Expand Down Expand Up @@ -200,13 +259,17 @@ Specify all options via the command line without a config file:
```bash
modelconverter convert rvc2 input_model models/yolov6n.onnx \
scale_values "[255,255,255]" \
reverse_input_channels True \
inputs.0.encoding.from RGB \
inputs.0.encoding.to BGR \
shape "[1,3,256,256]" \
outputs.0.name out_0 \
outputs.1.name out_1 \
outputs.2.name out_2
```

> [!WARNING]
> If you modify the default stage names (`stages.stage_name`) in the configuration file (`config.yaml`), you need to provide the full path to each stage in the command-line arguments. For instance, if a stage name is changed to `stage1`, use `stages.stage1.inputs.0.name` instead of `inputs.0.name`.
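
To make the dotted-key convention concrete, here is a minimal sketch of how `key value` pairs could be folded into a nested configuration. It is not ModelConverter's own parser: `apply_overrides` is a hypothetical helper, values are kept as raw strings, and numeric path segments such as `0` are treated as plain dictionary keys for simplicity.

```python
from typing import Any, Dict, List


def apply_overrides(config: Dict[str, Any], overrides: List[str]) -> Dict[str, Any]:
    """Apply flat `key value` pairs with dotted keys to a nested dict.

    Hypothetical helper for illustration only.
    """
    if len(overrides) % 2 != 0:
        raise ValueError("Overrides must be given as key/value pairs")

    # Walk the list two items at a time: key, then value.
    for key, value in zip(overrides[::2], overrides[1::2]):
        node = config
        parts = key.split(".")
        # Create intermediate levels as needed.
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return config
```

With a renamed stage, the full path `stages.stage1.inputs.0.name` addresses a different location than the short form `inputs.0.name`, which is why the full path is needed.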

## Multi-Stage Conversion

The converter supports multi-stage conversion. This means conversion of multiple
33 changes: 25 additions & 8 deletions modelconverter/__main__.py
Expand Up @@ -37,7 +37,7 @@
MODELS_DIR,
OUTPUTS_DIR,
)
from modelconverter.utils.types import Target
from modelconverter.utils.types import DataType, Encoding, Target

logger = logging.getLogger(__name__)

Expand Down Expand Up @@ -114,7 +114,7 @@ class Format(str, Enum):
DevOption: TypeAlias = Annotated[
bool,
typer.Option(
help="Builds a new iamge and uses the development docker-compose file."
help="Builds a new image and uses the development docker-compose file."
),
]

Expand Down Expand Up @@ -212,18 +212,31 @@ def extract_preprocessing(
for inp in stage_cfg.inputs:
mean = inp.mean_values or [0, 0, 0]
scale = inp.scale_values or [1, 1, 1]
encoding = inp.encoding
layout = inp.layout

dai_type = encoding.to.value
if dai_type != "NONE":
if inp.data_type == DataType.FLOAT16:
type = "F16F16F16"
else:
type = "888"
dai_type += type
dai_type += "i" if layout == "NHWC" else "p"

preproc_block = PreprocessingBlock(
reverse_channels=inp.reverse_input_channels,
mean=mean,
scale=scale,
interleaved_to_planar=False,
reverse_channels=encoding.to == Encoding.RGB,
interleaved_to_planar=layout == "NHWC",
dai_type=dai_type,
)
preprocessing[inp.name] = preproc_block

inp.mean_values = None
inp.scale_values = None
inp.reverse_input_channels = False
inp.encoding.from_ = Encoding.NONE
inp.encoding.to = Encoding.NONE

return cfg, preprocessing
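
The `dai_type` construction in `extract_preprocessing` can be isolated into a standalone sketch for easier reasoning. The function name `build_dai_type` is hypothetical, and plain strings stand in for the `Encoding` and `DataType` enums; the string `"float16"` for `DataType.FLOAT16` is an assumption.

```python
def build_dai_type(encoding_to: str, data_type: str, layout: str) -> str:
    """Compose a dai_type string from target encoding, data type and layout.

    Mirrors the logic in the diff above using plain strings (illustrative).
    """
    if encoding_to == "NONE":
        # No color encoding: nothing is appended.
        return "NONE"
    # Assumption: "float16" corresponds to DataType.FLOAT16.
    channel_type = "F16F16F16" if data_type == "float16" else "888"
    # "i" marks interleaved (NHWC) data, "p" marks planar data.
    suffix = "i" if layout == "NHWC" else "p"
    return f"{encoding_to}{channel_type}{suffix}"
```

Under these assumptions, a BGR input with 8-bit data in `NCHW` layout maps to a planar type, while a float16 input in `NHWC` layout maps to an interleaved one.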

Expand Down Expand Up @@ -477,9 +490,13 @@ def convert(
archive_cfg,
preprocessing,
main_stage,
exporter.inference_model_path
if isinstance(exporter, Exporter)
else exporter.exporters[main_stage].inference_model_path,
(
exporter.inference_model_path
if isinstance(exporter, Exporter)
else exporter.exporters[
main_stage
].inference_model_path
),
)
generator = ArchiveGenerator(
archive_name=f"{cfg.name}.{target.value.lower()}",
2 changes: 1 addition & 1 deletion modelconverter/packages/hailo/exporter.py
Expand Up @@ -237,7 +237,7 @@ def _get_alls(self, runner: ClientRunner) -> str:
f"{mean_values},{scale_values},{hn_name})"
)

if inp.reverse_input_channels:
if inp.encoding_mismatch:
alls.append(
f"bgr_to_rgb_{safe_name} = input_conversion("
f"{hn_name},bgr_to_rgb)"
42 changes: 30 additions & 12 deletions modelconverter/packages/rvc2/exporter.py
Expand Up @@ -109,25 +109,43 @@ def _export_openvino_ir(self) -> Path:
reverse_only=True,
)
for inp in self.inputs.values():
if inp.mean_values is not None and inp.reverse_input_channels:
if inp.mean_values is not None and inp.encoding_mismatch:
inp.mean_values = inp.mean_values[::-1]
if inp.scale_values is not None and inp.reverse_input_channels:
if inp.scale_values is not None and inp.encoding_mismatch:
inp.scale_values = inp.scale_values[::-1]
inp.reverse_input_channels = False
inp.encoding.from_ = Encoding.BGR
inp.encoding.to = Encoding.BGR

mean_values_str = ""
scale_values_str = ""
for name, inp in self.inputs.items():
# Append mean values in a similar style
if inp.mean_values is not None:
self._add_args(
args,
["--mean_values", f"{name}{_lst_join(inp.mean_values)}"],
if mean_values_str:
mean_values_str += ","
mean_values_str += (
f"{name}[{', '.join(str(v) for v in inp.mean_values)}]"
)

# Append scale values in a similar style
if inp.scale_values is not None:
self._add_args(
args,
["--scale_values", f"{name}{_lst_join(inp.scale_values)}"],
if scale_values_str:
scale_values_str += ","
scale_values_str += (
f"{name}[{', '.join(str(v) for v in inp.scale_values)}]"
)
if inp.reverse_input_channels:
self._add_args(args, ["--reverse_input_channels"])
# Extend args with mean and scale values if they were collected
if mean_values_str:
args.extend(["--mean_values", mean_values_str])
if scale_values_str:
args.extend(["--scale_values", scale_values_str])

# Append reverse_input_channels flag only once if needed
reverse_input_flag = any(
inp.encoding_mismatch for inp in self.inputs.values()
)
if reverse_input_flag:
args.append("--reverse_input_channels")

self._add_args(args, ["--input_model", self.input_model])

Expand All @@ -137,7 +155,7 @@ def _export_openvino_ir(self) -> Path:
return self.input_model.with_suffix(".xml")

def _check_reverse_channels(self):
reverses = [inp.reverse_input_channels for inp in self.inputs.values()]
reverses = [inp.encoding_mismatch for inp in self.inputs.values()]
return all(reverses) or not any(reverses)
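
The argument assembly and the channel-reversal check in this hunk can be sketched outside the exporter class. The helper names below are hypothetical, and plain dicts and lists stand in for the exporter's input objects.

```python
from typing import Dict, List, Optional, Sequence


def permute_if_mismatch(
    values: Optional[Sequence[float]], mismatch: bool
) -> Optional[List[float]]:
    """Reverse per-channel values when source and target encodings differ."""
    if values is None:
        return None
    return list(values)[::-1] if mismatch else list(values)


def join_values(values_by_input: Dict[str, Optional[Sequence[float]]]) -> str:
    """Build one comma-separated `name[v1, v2, v3]` argument string."""
    parts = []
    for name, values in values_by_input.items():
        if values is not None:
            parts.append(f"{name}[{', '.join(str(v) for v in values)}]")
    return ",".join(parts)


def reverse_is_consistent(mismatches: Sequence[bool]) -> bool:
    """True when either all inputs need channel reversal or none do."""
    return all(mismatches) or not any(mismatches)
```

Collecting all inputs into a single string means `--mean_values` and `--scale_values` are each passed once, and the consistency check matters because `--reverse_input_channels` is a single global flag.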

@staticmethod