Please find my Data Science course assessment in this repository.
The Python script implements the k-means clustering algorithm on 20 (x, y) points, each labeled either blue or red. The blue and red labels are assigned so that the points are divided into two separate groups. Five tasks were given to be completed.
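To illustrate the idea, here is a minimal sketch of a single k-means step with two color groups: compute the centroid of each group, then re-label every point with the color of its nearest centroid. This is not the assessment script itself. The sample points, the `(x, y, label)` record layout, and the function names `centroid` and `kmeans_step` are assumptions made for illustration only.

```python
import math

# Hypothetical sample data as (x, y, label) records; NOT the assessment's 20 points.
points = [
    (1.0, 1.0, "blue"),
    (1.5, 2.0, "blue"),
    (5.0, 7.0, "blue"),  # deliberately placed far from the other blue points
    (8.0, 8.0, "red"),
    (9.0, 9.5, "red"),
    (1.0, 0.5, "red"),   # deliberately placed far from the other red points
]


def centroid(group):
    """Return the mean (x, y) position of a non-empty list of records."""
    xs = [p[0] for p in group]
    ys = [p[1] for p in group]
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def kmeans_step(points):
    """Run one k-means step: compute centroids, then re-label each point."""
    labels = sorted({label for _, _, label in points})
    centroids = {
        label: centroid([p for p in points if p[2] == label])
        for label in labels
    }
    relabeled = []
    for x, y, _ in points:
        # Pick the label whose centroid has the smallest Euclidean distance.
        nearest = min(labels, key=lambda l: math.dist((x, y), centroids[l]))
        relabeled.append((x, y, nearest))
    return relabeled, centroids


relabeled, centroids = kmeans_step(points)
new_labels = [label for _, _, label in relabeled]
```

Repeating `kmeans_step` on its own output until the labels stop changing would give the usual iterative form of the algorithm.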
About the process and timing: I worked about 7h on this assessment.
- 3h research and trials on how to store the data in a useful way. I tried out different data types (lists, tuples, NumPy data types, ...) and decided to store the data initially in a tuple and then split it into separate lists for further processing.
- 1h implementing task 1 & task 2
- 1h research and implementation of Euclidean distance / task 4
- 0.5h re-labeling the data points
- 1h wrapping results into separate functions to fulfill task 5, repetition of tasks 2-4. Still work in progress!
- 0.5h documentation
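The storage approach and the distance calculation mentioned in the list above could look roughly like the following sketch. It assumes the data is held as a tuple of `(x, y, label)` records. The sample values and the names `DATA`, `split_columns`, and `euclidean_distance` are hypothetical and may differ from what the script actually uses.

```python
import math

# Hypothetical initial store: an immutable tuple of (x, y, label) records.
DATA = (
    (1.0, 2.0, "blue"),
    (4.0, 6.0, "red"),
    (2.0, 1.0, "blue"),
)


def split_columns(data):
    """Split the records into separate lists of x values, y values and labels."""
    xs, ys, labels = (list(column) for column in zip(*data))
    return xs, ys, labels


def euclidean_distance(p, q):
    """Return the straight-line distance between two (x, y) points."""
    return math.sqrt((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)


xs, ys, labels = split_columns(DATA)

# Distance between the first two points, (1, 2) and (4, 6): a 3-4-5 triangle.
d = euclidean_distance((xs[0], ys[0]), (xs[1], ys[1]))
```

Keeping the original tuple untouched and working on the derived lists means the re-labeling step can change `labels` without losing the initial assignment.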