Intro to ML: Unsupervised Learning

Quiz • Josiah Wang • Mathematics, Computers, Fun • University • 11 plays • Hard
10 questions
1.
MULTIPLE CHOICE QUESTION
1 min • 1 pt
Would you expect two runs of k-means clustering to produce the same clustering results?
yes
no
Answer explanation
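Not necessarily. k-means is sensitive to the initialisation stage, where centroids are randomly assigned to positions in the data space, so two runs with different random initialisations can converge to different clusterings.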
2.
MULTIPLE CHOICE QUESTION
1 min • 1 pt
Is it possible that the assignment of observations to clusters doesn’t change between successive iterations in K-Means?
yes
no
can't say
Answer explanation
Yes! Each centroid is updated to the average position of the datapoints which were assigned to it in the previous iteration. If the previous update of the centroid positions did not change which datapoints are assigned to each centroid, the centroid positions will not change either, and the algorithm has converged.
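The stopping condition described above can be sketched in a few lines of plain Python. This is a minimal illustration on 1-D data, not the implementation from the course; the function name `kmeans_1d`, the toy points and the starting centroids are all made up for the example.

```python
def kmeans_1d(points, centroids, max_iters=100):
    """Plain k-means on 1-D data; stops when assignments stop changing."""
    centroids = list(centroids)  # work on a copy of the initial centroids
    assignments = None
    for iteration in range(1, max_iters + 1):
        # Assignment step: index of the nearest centroid for every point.
        new_assignments = [
            min(range(len(centroids)), key=lambda k: abs(x - centroids[k]))
            for x in points
        ]
        if new_assignments == assignments:
            # Nothing changed, so the centroid means would not change either.
            return centroids, assignments, iteration
        assignments = new_assignments
        # Update step: move each centroid to the mean of its assigned points.
        for k in range(len(centroids)):
            members = [x for x, a in zip(points, assignments) if a == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = sum(members) / len(members)
    return centroids, assignments, max_iters


# Hypothetical toy data: two obvious groups on a line.
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids, assignments, n_iters = kmeans_1d(points, [0.0, 5.0])
print(centroids, assignments, n_iters)
```

On this toy data the assignments change once (the point 3.0 switches cluster) and are then identical between successive iterations, which is when the loop exits.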
3.
MULTIPLE CHOICE QUESTION
1 min • 1 pt
True or False. The larger the number of centroids in K-means, the less likely the model is to overfit
True
False
Answer explanation
If you keep increasing the number of centroids, at some point K will equal the number of data points. This will result in each data instance being assigned its own unique cluster. You will be fitting the spurious noise, not the underlying trend of the data! The challenge with k-means is picking the correct number of centroids for the problem.
4.
MULTIPLE CHOICE QUESTION
1 min • 1 pt
True or False. The initial position of the centroids does not affect the final result of K-Means
True
False
Answer explanation
As the centroids are simply updated to the average position of their assigned datapoints, there is no guarantee of convergence to a global optimum. Instead, the algorithm converges to a local minimum that depends on the centroid initialisation.
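A small sketch can make the local-minimum behaviour concrete. The four points below (corners of a wide rectangle) and both sets of starting centroids are invented for illustration, and the `kmeans` helper is a bare-bones version written for this example rather than any library's API.

```python
def kmeans(points, centroids, max_iters=100):
    """Run k-means from the given initial centroids; return (centroids, sse)."""
    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iters):
        # Assign each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda k: sq_dist(p, centroids[k]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if empty).
        updated = [
            tuple(sum(coord) / len(members) for coord in zip(*members))
            if members else c
            for members, c in zip(clusters, centroids)
        ]
        if updated == centroids:
            break  # centroids are stable, so assignments are too
        centroids = updated
    # Sum of squared distances from each point to its nearest centroid.
    sse = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return centroids, sse


# Hypothetical data: corners of a 4 x 1 rectangle, clustered with K = 2.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]
_, sse_left_right = kmeans(points, [(0.0, 0.5), (4.0, 0.5)])
_, sse_top_bottom = kmeans(points, [(2.0, 0.0), (2.0, 1.0)])
print(sse_left_right, sse_top_bottom)
```

Both starting positions are already stable, so neither run moves its centroids, yet the second run ends with a far larger sum of squared distances. This is why practical implementations commonly restart from several initialisations and keep the best result.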
5.
MULTIPLE CHOICE QUESTION
1 min • 1 pt
A student has applied the k-means algorithm to an unsupervised problem. On analysis they find that the mean distance between data instances and the cluster centres to which they are assigned is 0. What does this mean?
That the chosen value of k must equal the true number of clusters
That the chosen value of k must be at least equal to the number of datapoints
That this specific configuration (i.e. position) of k centroids is optimal for this dataset
None of these
Answer explanation
Assuming that no two datapoints have identical attributes, there will always be a positive mean distance between a centroid and its assigned datapoints if that centroid has more than one datapoint assigned to it. A mean distance of 0 therefore means that every centroid has at most one datapoint assigned to it.
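A quick numerical check of this argument, as a sketch with made-up points: the mean distance is 0 when every datapoint has its own centroid, and strictly positive as soon as two distinct datapoints share one. The helper name `mean_distance_to_nearest` is introduced only for this example.

```python
import math


def mean_distance_to_nearest(points, centroids):
    """Average Euclidean distance from each point to its nearest centroid."""
    return sum(min(math.dist(p, c) for c in centroids) for p in points) / len(points)


# Hypothetical dataset with three distinct points.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]

# K = 3: one centroid sitting on every datapoint.
one_centroid_per_point = list(points)
# K = 2: the first two points share a centroid at their mean.
shared_centroid = [(0.5, 0.0), (5.0, 5.0)]

zero_case = mean_distance_to_nearest(points, one_centroid_per_point)
shared_case = mean_distance_to_nearest(points, shared_centroid)
print(zero_case, shared_case)
```

The first configuration memorises the data exactly, which is the same situation as the overfitting case where K equals the number of data points.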
6.
MULTIPLE CHOICE QUESTION
1 min • 1 pt
The K-means algorithm was executed several times with different values of K. The mean distance between validation datapoints and the nearest centroid was calculated and plotted. From this plot determine the best value for K.
1
3
4
6
9
Answer explanation
Check out the 'Elbow' method in the slides. The point at which the score stops declining sharply and plateaus as K increases suggests where you stop modelling the true underlying clusters of the data and start to model noise.
7.
MULTIPLE SELECT QUESTION
45 sec • 1 pt
Which of the following are limitations of the k-means algorithm?
It is sensitive to outliers
It is sensitive to initialisation
It has exponential time complexity with dataset size
It is not suitable for datasets containing non-hyper-ellipsoidal clusters
None of the above
Answer explanation
Check the slides!
8.
MULTIPLE CHOICE QUESTION
45 sec • 1 pt
What does GMM-EM optimise?
Minimises the average distance between the samples and the mean of the nearest Gaussian
Minimises the negative-log-likelihood of the model
Maximises the negative-log-likelihood of the model
Maximises the classification rate
None of these
Answer explanation
Check definitions in slides
9.
MULTIPLE CHOICE QUESTION
1 min • 1 pt
True or False. If the responsibility r_nk is high, it means that data point n is a plausible sample from the k-th mixture component
True
False
Answer explanation
Responsibilities define the probability of each data point belonging to each cluster. Remember GMM-EM is a soft assignment!
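As a rough sketch of what a responsibility is, the snippet below computes r_nk for a single 1-D data point under a two-component mixture using the standard formula r_k = pi_k N(x | mu_k, var_k) / sum_j pi_j N(x | mu_j, var_j). The mixture parameters and the helper names `gaussian_pdf` and `responsibilities` are assumptions made for this example, not values from the course.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def responsibilities(x, weights, means, variances):
    """E-step for one point: weighted densities normalised to sum to 1."""
    weighted = [w * gaussian_pdf(x, m, v)
                for w, m, v in zip(weights, means, variances)]
    total = sum(weighted)
    return [w / total for w in weighted]


# Hypothetical mixture: two equally weighted unit-variance components.
weights, means, variances = [0.5, 0.5], [0.0, 4.0], [1.0, 1.0]

r_near_first = responsibilities(0.0, weights, means, variances)
r_midpoint = responsibilities(2.0, weights, means, variances)
print(r_near_first, r_midpoint)
```

A point at the first component's mean gets almost all of its responsibility from that component, while a point halfway between the two means is shared equally. Taking the argmax of the responsibilities would turn this soft assignment into a k-means-style hard assignment.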
10.
MULTIPLE CHOICE QUESTION
45 sec • 1 pt
True or False? The only differences between GMM-EM and k-means are the non-isotropic distance to the centroids/means and that for GMM-EM this metric varies during the learning process.
True
False
Answer explanation
k-means is a hard assignment whereas GMM-EM is a soft assignment. Every point belongs to every cluster, to a degree given by the corresponding responsibility.