The `AclNet` model is designed to perform sound classification and is trained on an internal dataset of environmental sounds covering 53 classes, listed in the file `<omz_dir>/data/dataset_classes/aclnet_53cl.txt`.
For details about the model, see this paper.
The model input is a segment of PCM audio samples in `N, C, 1, L` format.
The model output is a set of sound classifier scores for the 53 environmental sound classes from the internal sound database.
| Metric | Value |
|---|---|
| Type | Classification |
| GFLOPs | 1.42 |
| MParams | 2.71 |
| Source framework | PyTorch* |

| Metric | Value |
|---|---|
| Top 1 | 86.3% |
| Top 5 | 92.0% |
Metrics were computed on an internal validation dataset according to the following publication and paper.
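For readers unfamiliar with the Top 1 and Top 5 figures, the sketch below uses the common definition of top-k accuracy: a sample counts as correct when its true class is among the k highest-scoring classes. This is an illustration only. The function name `top_k_accuracy` and the toy scores are hypothetical, and the exact evaluation procedure used for the reported numbers is the one described in the referenced publication.

```python
def top_k_accuracy(score_rows, true_classes, k):
    """Fraction of samples whose true class index is among the k best scores.

    score_rows   -- list of per-sample score lists (one score per class)
    true_classes -- list of ground-truth class indices, same length
    """
    if not true_classes or len(score_rows) != len(true_classes):
        raise ValueError("need the same non-zero number of rows and targets")
    hits = 0
    for scores, target in zip(score_rows, true_classes):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        if target in ranked[:k]:
            hits += 1
    return hits / len(true_classes)


# Toy data with 3 classes (not real model output).
rows = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6], [0.4, 0.5, 0.1]]
targets = [0, 1, 0]
print(top_k_accuracy(rows, targets, k=1))
print(top_k_accuracy(rows, targets, k=2))
```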
Audio, name - `input`, shape - `1, 1, 1, L`, format is `N, C, 1, L`, where:
- `N` - batch size
- `C` - channel
- `L` - number of PCM samples (minimum value is 16000)
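As a rough illustration of the `1, 1, 1, L` layout, the sketch below wraps a mono sequence of PCM samples into a nested list of that shape. The helper names `to_model_input` and `shape_of` are hypothetical, and zero-padding clips shorter than 16000 samples is an assumption of this sketch rather than documented behavior. Any sample scaling or resampling the model may expect is not covered here.

```python
MIN_SAMPLES = 16000  # minimum value of L given in the input description


def to_model_input(samples, min_len=MIN_SAMPLES):
    """Wrap mono PCM samples as a nested list shaped [1, 1, 1, L].

    Clips shorter than min_len are zero-padded at the end
    (an assumption of this sketch, not documented model behavior).
    """
    clip = list(samples)
    if len(clip) < min_len:
        clip.extend([0] * (min_len - len(clip)))
    # N = 1 (batch), C = 1 (channel), a single row of L samples.
    return [[[clip]]]


def shape_of(nested):
    """Return the dimensions of a regular nested list as a tuple."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        if not nested:
            break
        nested = nested[0]
    return tuple(dims)


batch = to_model_input([0] * 12000)  # 12000 silent samples, padded to 16000
print(shape_of(batch))  # (1, 1, 1, 16000)
```

With a real runtime, the nested list would typically be converted to whatever tensor type that runtime accepts before inference.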
Sound classifier (see the labels file, `<omz_dir>/data/dataset_classes/aclnet_53cl.txt`), name - `output`, shape - `1, 53`, output data format is `N, C`, where:
- `N` - batch size
- `C` - predicted softmax scores for each class in [0, 1] range
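One way to read the `1, 53` output is to rank the scores and pair the best ones with class names. The sketch below is a minimal example under stated assumptions: `top_k` is a hypothetical helper, the scores are made up, and the labels are placeholders (`class_0` ... `class_52`) standing in for the names in the labels file, whose exact layout is not described here.

```python
def top_k(scores, labels, k=5):
    """Return the k highest-scoring (label, score) pairs for one output row."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    # Class indices ordered from highest to lowest score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(labels[i], scores[i]) for i in ranked[:k]]


# Placeholder labels; real names would come from the labels file.
labels = [f"class_{i}" for i in range(53)]

# Made-up output with shape [1, 53]: one batch item, 53 class scores.
output = [[0.0] * 53]
output[0][7] = 0.9
output[0][21] = 0.1

best = top_k(output[0], labels, k=2)
print(best)  # [('class_7', 0.9), ('class_21', 0.1)]
```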
You can download models and, if necessary, convert them into OpenVINO™ IR format using the Model Downloader and other automation tools, as shown in the examples below.
An example of using the Model Downloader:
omz_downloader --name <model_name>
An example of using the Model Converter:
omz_converter --name <model_name>
The model can be used in the following demos provided by the Open Model Zoo to show its capabilities:
The original model is distributed under Apache License, Version 2.0.