QUIZ

Clustering

University

Shakirah Bt M Taib

5 years

20 questions

1K plays

55% accuracy

1. Multiple Choice
30 seconds
1 pt
The goal of clustering a set of data is to
divide them into groups of data that are near each other
choose the best data from the set
determine the nearest neighbors of each of the data
predict the class of data
2. Multiple Choice
30 seconds
1 pt
The k-means algorithm...
always converges to a clustering that minimizes the mean-square vector-representative distance
can converge to different final clusterings, depending on the initial choice of representatives
is widely used in practice
is typically done by hand, using paper and pencil
should only be attempted by trained professionals
3. Multiple Choice
30 seconds
1 pt
The choice of k, the number of clusters to partition a set of data into,...
is a personal choice that shouldn't be discussed in public
depends on why you are clustering the data
should always be as large as your computer system can handle
has a maximum of 10
4. Multiple Choice
30 seconds
1 pt
Which of the following statements about the K-means algorithm are correct?
The K-means algorithm is sensitive to outliers.
For different initializations, the K-means algorithm will definitely give the same clustering results.
The centroids in the K-means algorithm may not be any observed data points.
The K-means algorithm can detect non-convex clusters.
5. Multiple Choice
30 seconds
1 pt
Considering the K-median algorithm, if points (0, 3), (2, 1), and (-2, 2) are the only points which are assigned to the first cluster now, what is the new centroid for this cluster?
(0,2)
(2,1)
(2,0)
(1,2)
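
A minimal sketch of the K-median update for the question above, assuming the usual definition in which the new cluster center is the coordinate-wise median of the assigned points. The function name `k_median_center` is hypothetical and chosen only for illustration.

```python
from statistics import median


def k_median_center(points):
    """Return the coordinate-wise median of a list of 2-D points.

    Assumption: K-median updates each center to the per-dimension
    median of the points currently assigned to that cluster.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (median(xs), median(ys))


# Points assigned to the first cluster in the question.
cluster = [(0, 3), (2, 1), (-2, 2)]
center = k_median_center(cluster)
print(center)
```

Note that the median is taken separately for each coordinate, so the resulting center does not have to coincide with any of the assigned points.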
6. Multiple Choice
1 minute
1 pt
Considering the K-means algorithm, after current iteration, we have 3 centroids (0, 1) (2, 1), (-1, 2). Will points (2, 3) and (2, 0.5) be assigned to the same cluster in the next iteration?
Yes
No
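
The assignment step in the question above can be checked with a short sketch, assuming Euclidean distance (the usual choice for K-means). The helper name `nearest_centroid` is hypothetical.

```python
import math


def nearest_centroid(point, centroids):
    """Return the index of the centroid closest to `point` (Euclidean)."""
    distances = [math.dist(point, c) for c in centroids]
    return distances.index(min(distances))


# Centroids after the current iteration, as given in the question.
centroids = [(0, 1), (2, 1), (-1, 2)]

a = nearest_centroid((2, 3), centroids)
b = nearest_centroid((2, 0.5), centroids)
print(a, b, a == b)
```

Comparing the two returned indices shows whether both points fall into the same cluster in the next iteration.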
7. Multiple Choice
1 minute
1 pt
The Iris dataset contains information about Iris setosa and versicolor. What is the Euclidean distance between these two objects?
2.8
4.6
22.6
-3.6
8. Multiple Choice
30 seconds
1 pt
Which of the following statements are true?
Graphs, time-series data, text, and multimedia data are all examples of data types on which cluster analysis can be performed.
Agglomerative clustering is an example of a hierarchical and distance-based clustering method.
When dealing with high-dimensional data, we sometimes consider only a subset of the dimensions when performing cluster analysis.
We can only visualize the clustering results when the data is 2-dimensional.
9. Multiple Choice
30 seconds
1 pt
Which of the following statements are true?
Clustering analysis is unsupervised learning since it does not require labeled training data.
It is impossible to cluster objects in a data stream. We must have all the data objects that we need to cluster ready before clustering can be performed.
Clustering analysis has a wide range of applications in tasks such as data summarization, dynamic trend detection, multimedia analysis, and biological network analysis.
When clustering, we want to put two dissimilar data objects into the same cluster.
10. Multiple Choice
30 seconds
1 pt
What are some common considerations and requirements for cluster analysis?
We need to consider how to incorporate user preference for cluster size and shape into the clustering algorithm.
In order to perform cluster analysis, we need to have a similarity measure between data objects.
We need to be able to handle a mixture of different types of attributes (e.g., numerical, categorical).
We must know the number of output clusters a priori for all clustering algorithms.
11. Multiple Choice
30 seconds
1 pt
What are the two types of hierarchical clustering?
Top-Down Clustering (Divisive)
Bottom-Up Clustering (Agglomerative)
Dendrogram
K-means
12. Multiple Choice
30 seconds
1 pt
What is a Dendrogram?
A tree diagram used to illustrate the arrangement of clusters in hierarchical clustering.
A tree diagram used to illustrate the arrangement of clusters in partitional clustering.
A type of hierarchical clustering.
A type of bar chart diagram to visualize k-means clusters.
13. Multiple Choice
30 seconds
1 pt
The most important part of _____ is selecting the variables on which clustering is based.
interpreting and profiling clusters
selecting a clustering procedure
assessing the validity of clustering
formulating the clustering problem
14. Multiple Choice
30 seconds
1 pt
The most commonly used measure of similarity is the _____ or its square.
euclidean distance
city-block distance
Chebyshev's distance
Manhattan distance
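
For reference, the distance measures named in the options above can be sketched as follows. The function names are hypothetical; city-block distance and Manhattan distance are two names for the same measure, so only one function is defined for both.

```python
def euclidean(p, q):
    """Straight-line distance: square root of the summed squared differences."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


def city_block(p, q):
    """City-block (Manhattan) distance: sum of absolute differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


def chebyshev(p, q):
    """Chebyshev distance: largest absolute difference over all coordinates."""
    return max(abs(a - b) for a, b in zip(p, q))


p, q = (0, 0), (3, 4)
print(euclidean(p, q), city_block(p, q), chebyshev(p, q))
```

For the same pair of points the three measures generally give different values, which is why the choice of distance can change the resulting clusters.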
15. Multiple Choice
30 seconds
1 pt
_____ is a clustering procedure where all objects start out in one giant cluster. Clusters are formed by dividing this cluster into smaller and smaller clusters.
Non-hierarchical clustering
Divisive clustering
Agglomerative clustering
K-means clustering
16. Multiple Choice
30 seconds
1 pt
Which of the following is required by K-means clustering?
defined distance metric
number of clusters
initial guess as to cluster centroids
all answers are correct
17. Multiple Choice
30 seconds
1 pt
In the figure above, if you draw a horizontal line across the y-axis at y=2, what will be the number of clusters formed?
2
4
3
5
18. Multiple Choice
30 seconds
1 pt
For which of the following tasks might clustering be a suitable approach?
Given sales data from a large number of products in a supermarket, estimate future sales for each of these products.
Given a database of information about your users, automatically group them into different market segments.
From the user's usage patterns on a website, identify different user groups.
Given historical weather records, predict if tomorrow's weather will be sunny or rainy.
19. Multiple Choice
30 seconds
1 pt
K-means is an iterative algorithm, and two of the following steps are repeatedly carried out in its inner-loop. Which two?
Assign each point to its nearest cluster
Test on the cross-validation set
Update the cluster centroids based on the current assignment
Using the elbow method to choose K
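
The iterative structure of K-means described in this question can be sketched in a few lines. This is an illustrative sketch only, assuming Euclidean distance, caller-supplied initial centroids, and a stop condition of "centroids no longer change"; the function name `kmeans` and the toy data are hypothetical.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Plain K-means: alternate assignment and centroid update until stable."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: give each point the index of its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                mean = tuple(sum(c) / len(members) for c in zip(*members))
                new_centroids.append(mean)
            else:
                # Assumption: an empty cluster keeps its previous centroid.
                new_centroids.append(old)
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids


# Toy data: two well-separated pairs of points.
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
labels, centroids = kmeans(points, [(0, 0), (10, 10)])
print(labels, centroids)
```

Because the result depends on the initial centroids passed in, running the sketch with different starting values can produce a different final clustering. The final centroids are means, so they need not be observed data points.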
20. Multiple Choice
30 seconds
1 pt
Clustering should be done on samples of 300 or more.
False
True
