Benchmarks on different kmeans implementations (R/Julia; algorithm variants; initialization)

szcf-weiya/KmeansBenchmarks.jl

KmeansBenchmarks.jl

This project seeks to systematically benchmark and compare k-means implementations across the following aspects:

  • Software ecosystem: R (e.g., stats, ClusterR) vs Julia (e.g., Clustering)
  • Algorithm variants: e.g., Lloyd’s and Hartigan-Wong
  • Initialization: Random seeding, k-means++
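
To make the algorithm and initialization axes concrete, here is a minimal, language-neutral sketch in plain Python of Lloyd's iterations combined with either random seeding or k-means++ seeding. It is not the code used by this project or by any of the benchmarked packages. All function names (`sq_dist`, `random_init`, `kmeanspp_init`, `nearest`, `lloyd`) and the toy data are hypothetical, and Hartigan-Wong is not sketched here.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def random_init(points, k, rng):
    """Random seeding: pick k distinct data points uniformly at random."""
    return rng.sample(points, k)

def kmeanspp_init(points, k, rng):
    """k-means++ seeding (assumes the data contain at least k distinct points).

    The first center is drawn uniformly; each later center is drawn with
    probability proportional to its squared distance to the nearest
    center chosen so far.
    """
    centers = [rng.choice(points)]
    while len(centers) < k:
        d2 = [min(sq_dist(p, c) for c in centers) for p in points]
        centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers

def nearest(p, centers):
    """Index of the center closest to p (ties go to the lowest index)."""
    return min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))

def lloyd(points, centers, max_iter=100):
    """Lloyd's algorithm: alternate assignment and mean-update steps."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        labels = [nearest(p, centers) for p in points]
        updated = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(old)  # keep the old center if a cluster is empty
        if updated == centers:  # centers stopped moving: converged
            break
        centers = updated
    labels = [nearest(p, centers) for p in points]
    return labels, centers

# Toy data (hypothetical): two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
rng = random.Random(0)
labels, centers = lloyd(points, kmeanspp_init(points, 2, rng))
print(labels, centers)
```

Swapping `kmeanspp_init` for `random_init` changes only the starting centers; the iterations stay the same, which is why initialization can be treated as a separate axis from the algorithm variant.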

We evaluate performance using three main metrics:

  • Clustering accuracy
  • Ratio of the between-cluster sum of squares to the total sum of squares (BSS/TSS)
  • Computational time
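
As a rough illustration of the first two metrics, the sketch below computes the between-cluster to total sum-of-squares ratio through the identity BSS = TSS − WSS, and a clustering accuracy that matches predicted cluster ids to true classes by trying every relabeling. The matching convention is an assumption (the README does not say how accuracy is defined), and the function names and toy data are hypothetical. The brute-force search over permutations is only practical for a small number of clusters.

```python
from itertools import permutations

def bss_tss_ratio(points, labels):
    """Between-cluster sum of squares divided by total sum of squares."""
    n, dim = len(points), len(points[0])
    grand = [sum(p[d] for p in points) / n for d in range(dim)]
    tss = sum((p[d] - grand[d]) ** 2 for p in points for d in range(dim))
    wss = 0.0
    for lab in set(labels):
        members = [p for p, l in zip(points, labels) if l == lab]
        center = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        wss += sum((p[d] - center[d]) ** 2 for p in members for d in range(dim))
    return (tss - wss) / tss  # BSS = TSS - WSS

def clustering_accuracy(true_labels, pred_labels):
    """Best fraction of matches over all one-to-one relabelings.

    Assumes there are no more predicted clusters than true classes.
    """
    true_ids = sorted(set(true_labels))
    pred_ids = sorted(set(pred_labels))
    best = 0
    for perm in permutations(true_ids, len(pred_ids)):
        mapping = dict(zip(pred_ids, perm))
        hits = sum(mapping[p] == t for p, t in zip(pred_labels, true_labels))
        best = max(best, hits)
    return best / len(true_labels)

# Toy data (hypothetical): two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
truth = [0, 0, 0, 1, 1, 1]
pred = [1, 1, 1, 0, 0, 0]  # same partition, different cluster ids

print(bss_tss_ratio(points, pred))       # close to 1 for tight, separated clusters
print(clustering_accuracy(truth, pred))  # 1.0: the ids differ but the partition matches
```

A ratio near 1 means most of the variance is explained by the cluster structure. Unlike accuracy, it needs no ground-truth labels.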

💫 You can check the interactive Plotly figures at https://hohoweiya.xyz/KmeansBenchmarks.jl

This work aims to provide actionable insights for researchers and practitioners in selecting optimal k-means configurations tailored to their data size, dimensionality, and domain requirements.
