Unsupervised Clustering

DrCoffey edited this page Jul 26, 2021 · 4 revisions

The unsupervised clustering function applies k-means to perceptually relevant dimensions of the extracted contour to place calls into a predefined number of clusters.

Each call is segmented into 12 partitions. The k-means algorithm operates on the slope and frequency of each partition, as well as the call duration.
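
As a rough illustration of the feature vector described above, the following Python sketch splits one contour into 12 partitions and computes a frequency and a slope for each, plus the call duration. It is not the tool's own code: the function name `contour_features`, the use of linear interpolation, the choice of the partition midpoint as its "frequency", and the units (seconds, kHz) are all assumptions made for the example.

```python
import numpy as np

N_PARTITIONS = 12  # number of partitions per call, as stated above


def contour_features(times, freqs, n_partitions=N_PARTITIONS):
    """Return [12 frequencies, 12 slopes, duration] for one call contour.

    Hypothetical helper: `times` (s) must be strictly increasing and
    `freqs` (kHz) must have the same length as `times`.
    """
    times = np.asarray(times, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    duration = times[-1] - times[0]

    # Resample the contour at the 13 evenly spaced partition edges.
    edges = np.linspace(times[0], times[-1], n_partitions + 1)
    edge_freqs = np.interp(edges, times, freqs)

    # Frequency of a partition: average of its two edge frequencies (assumption).
    part_freq = (edge_freqs[:-1] + edge_freqs[1:]) / 2
    # Slope of a partition: frequency change divided by partition length.
    part_slope = np.diff(edge_freqs) / np.diff(edges)

    return np.concatenate([part_freq, part_slope, [duration]])
```

Each call then becomes one row of 25 numbers (12 frequencies, 12 slopes, 1 duration), and the rows for all calls form the matrix that k-means clusters.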

To perform unsupervised clustering using k-means:

  1. Click "Tools > Call Classification > Unsupervised Clustering"

  2. Select the detection files to cluster OR select the saved contours

  3. After the detection files are processed, you may save the extracted contours for faster loading

  4. Choose the clustering method. ARTwarp and variational autoencoders are still experimental, so k-means is currently recommended

  5. Enter the weights (relative importance) of each dimension

  6. Decide whether to use an existing model (Default is "No")

  7. Enter the number of call categories or select elbow optimization

    • Elbow optimization: Enter the maximum number of clusters to test (Default = 100)
    • Elbow optimization clusters the data sequentially for every cluster count from 1 to the maximum and calculates the within-cluster error for each. It then looks for the elbow of the error curve and selects that point as the optimal number of clusters for the user's data set.
  8. Once clustering finishes, you will be prompted to save the model. This is optional.

  9. A new interface will appear, showing the clusters. This interface can also be found under "Tools > Call Classification > View Clusters"

    • Name the clusters by entering a name in the text box. Clusters with the same name will be merged upon saving.

    • View different clusters with the "Next" and "Back" buttons.

    • View more calls within a cluster with the "Next Page" and "Previous Page" buttons.

    • Reject calls by clicking on them. Calls highlighted in red will be rejected upon saving.

    • Update the call files by clicking "Save", or redo the clustering with "Redo"
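
The weighting in step 5 and the elbow optimization in step 7 can be sketched in Python as follows. This is an illustrative approximation, not the tool's implementation: all function names are hypothetical, the weighting is assumed to be a z-score followed by a per-column multiplier, the k-means is a plain Lloyd's algorithm with a single random start, and the elbow is taken as the point farthest below the straight line joining the ends of the normalized error curve, which is one common heuristic among several.

```python
import numpy as np


def apply_weights(X, weights):
    """Z-score each column, then scale it by its weight (assumed scheme)."""
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # avoid dividing by zero for constant columns
    return (X - X.mean(axis=0)) / std * np.asarray(weights, dtype=float)


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns (labels, within-cluster squared error)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Start from k distinct data points (fancy indexing returns a copy).
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1), d2.min(axis=1).sum()


def elbow_curve(X, max_k):
    """Within-cluster error for k = 1 .. max_k (max_k must not exceed len(X))."""
    return [kmeans(X, k)[1] for k in range(1, max_k + 1)]


def elbow_k(errors):
    """Return the k whose error lies farthest below the end-to-end chord."""
    e = np.asarray(errors, dtype=float)
    if len(e) < 3 or e.max() == e.min():
        return 1  # no elbow can be located on a flat or very short curve
    x = np.linspace(0.0, 1.0, len(e))          # k rescaled to [0, 1]
    y = (e - e.min()) / (e.max() - e.min())    # error rescaled to [0, 1]
    chord = y[0] + (y[-1] - y[0]) * x
    return int(np.argmax(chord - y)) + 1       # +1 because k starts at 1
```

A typical call would be `elbow_k(elbow_curve(apply_weights(X, weights), max_k))`, where `X` holds one row of features per call and `weights` holds one entry per feature column. Because this sketch uses a single random start, a production run would normally repeat k-means several times per cluster count and keep the lowest error.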