What does GMM-EM optimise?

Minimises the negative-log-likelihood of the model

Minimises the average distance between the samples and the mean of the nearest Gaussian

Maximises the negative-log-likelihood of the model

Maximises the classification rate

Intro to ML: Unsupervised Learning

Authored by Josiah Wang

Mathematics, Computers, Fun

University

Used 14+ times

10 questions

MULTIPLE CHOICE QUESTION

1 min • 1 pt

Does one expect two runs of k-means clustering to produce the same clustering results?

yes

Answer explanation

No. K-means is sensitive to initialisation: the centroids start at random positions in the data space, so two runs can converge to different clusterings.

MULTIPLE CHOICE QUESTION

1 min • 1 pt

Is it possible that the assignment of observations to clusters doesn’t change between successive iterations in K-Means?

yes

can't say

Answer explanation

Yes! Each centroid is updated to the average position of the datapoints assigned to it in the previous iteration. If the previous update did not cause any datapoint to change cluster, the centroid positions will not change either; this is when k-means has converged.
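The stopping condition described above can be sketched as follows. This is a minimal illustration, not the course's reference implementation: the function names `assign`, `update` and `kmeans`, the toy points and the starting centroids are all assumptions made for the example.

```python
import math

def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points]

def update(points, labels, centroids):
    """Move each centroid to the mean of its assigned points (keep it if empty)."""
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == k]
        if members:
            new_centroids.append(tuple(sum(xs) / len(members) for xs in zip(*members)))
        else:
            new_centroids.append(old)
    return new_centroids

def kmeans(points, centroids, max_iter=100):
    """Iterate until the assignments stop changing between iterations."""
    labels = None
    for iteration in range(1, max_iter + 1):
        new_labels = assign(points, centroids)
        if new_labels == labels:  # no point changed cluster: converged
            return centroids, labels, iteration
        labels = new_labels
        centroids = update(points, labels, centroids)
    return centroids, labels, max_iter

# Hypothetical toy data: two pairs of points far apart.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
centroids, labels, n_iter = kmeans(points, [(0.0, 0.0), (1.0, 0.0)])
print(centroids, labels, n_iter)
```

On this toy data the second pass produces the same assignments as the first, so the loop exits without moving the centroids again.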

MULTIPLE CHOICE QUESTION

1 min • 1 pt

True or False. The larger the number of centroids in K-means, the less likely the model is to overfit

True

False

Answer explanation

If you keep increasing the number of centroids, at some point K will equal the number of data points, and each data instance will be assigned its own cluster. You will be fitting the spurious noise, not the underlying trend of the data! The challenge with k-means is picking the correct number of centroids for the problem.

MULTIPLE CHOICE QUESTION

1 min • 1 pt

True or False. The initial position of the clusters does not affect the final result of K-Means

True

False

Answer explanation

Because each centroid is simply updated to the average position of its assigned datapoints, there is no guarantee of convergence to a global optimum. Instead, k-means converges to a local minimum that depends on the initialisation.
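A small sketch of this dependence on initialisation, using four points at the corners of a wide rectangle. The data, the two starting configurations and the helper names (`sq_dist`, `kmeans`, `sse`) are assumptions chosen so that the effect is easy to see; they do not come from the quiz.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, init, max_iter=100):
    """Standard k-means iterations from the given starting centroids."""
    centroids = list(init)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                centroids[k] = tuple(sum(xs) / len(members) for xs in zip(*members))
    return centroids, labels

def sse(points, centroids, labels):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(sq_dist(p, centroids[label]) for p, label in zip(points, labels))

# Hypothetical data: a 4-by-1 rectangle, so left/right is the natural split.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

cent_a, labels_a = kmeans(points, [(0.0, 0.5), (4.0, 0.5)])  # starts left/right
cent_b, labels_b = kmeans(points, [(2.0, 0.0), (2.0, 1.0)])  # starts bottom/top

sse_a = sse(points, cent_a, labels_a)
sse_b = sse(points, cent_b, labels_b)
print(labels_a, sse_a)
print(labels_b, sse_b)
```

Both runs stop because their assignments no longer change, yet the second run settles on a bottom/top split with a much larger sum of squared distances than the left/right split.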

MULTIPLE CHOICE QUESTION

1 min • 1 pt

A student has applied the k-means algorithm to an unsupervised problem. On analysis they find that the mean distance between data instances and the cluster centres to which they are assigned is 0. What does this mean?

That the chosen value of k must equal the true number of clusters

That the chosen value of k must at least equal the number of datapoints

That this specific configuration (i.e. position) of k centroids is optimal for this dataset

None of these

Answer explanation

Assuming that no two datapoints have identical attributes, the mean distance between centroids and their assigned datapoints will always be positive if any centroid has more than one datapoint assigned to it.
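The argument can be checked numerically on a tiny example. The three points and both centroid configurations below are hypothetical, and `mean_distance` is an illustrative helper rather than anything defined in the course.

```python
import math

def mean_distance(points, centroids):
    """Mean distance from each point to its nearest centroid."""
    return sum(min(math.dist(p, c) for c in centroids) for p in points) / len(points)

# Three distinct, made-up datapoints.
points = [(0.0, 0.0), (1.0, 0.0), (4.0, 3.0)]

# One centroid sitting exactly on each datapoint.
own_centroids = list(points)

# Two centroids: the first is shared by the two leftmost points.
shared_centroids = [(0.5, 0.0), (4.0, 3.0)]

dist_own = mean_distance(points, own_centroids)
dist_shared = mean_distance(points, shared_centroids)
print(dist_own, dist_shared)
```

As soon as two distinct points share a centroid, that centroid lies between them and the mean distance becomes positive.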

MULTIPLE CHOICE QUESTION

1 min • 1 pt

The K-means algorithm was executed several times with different values of K. The mean distance between validation datapoints and the nearest centroid was calculated and plotted. From this plot determine the best value for K.

Answer explanation

Check out the 'Elbow' method in the slides. The point where the score's decline sharply plateaus as K increases suggests where you stop modelling the true underlying clusters of the data and start to model noise.
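One simple way to locate such a plateau programmatically is to look for the K where the curve bends most sharply. The sketch below uses the largest discrete second difference as the bend measure, which is only one of several possible heuristics and is not necessarily the one used in the slides. The scores are made-up numbers for illustration and are not read from the quiz's plot.

```python
# Hypothetical validation scores (mean distance to nearest centroid) for K = 1..6.
# These values are invented for the example.
ks = [1, 2, 3, 4, 5, 6]
scores = [10.0, 5.0, 1.0, 0.8, 0.7, 0.65]

def elbow(ks, scores):
    """Return the K with the largest discrete second difference of the score."""
    best_k, best_bend = None, float("-inf")
    for i in range(1, len(scores) - 1):
        # How much the drop before K exceeds the drop after K.
        bend = scores[i - 1] - 2 * scores[i] + scores[i + 1]
        if bend > best_bend:
            best_k, best_bend = ks[i], bend
    return best_k

best_k = elbow(ks, scores)
print(best_k)
```

With these invented scores the curve falls steeply up to K = 3 and is nearly flat afterwards, so the heuristic selects K = 3. On a real plot the elbow is often less clear-cut and is usually judged by eye.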

MULTIPLE SELECT QUESTION

45 sec • 1 pt

Which of the following are limitations of the k-means algorithm?

It is sensitive to outliers

It is sensitive to initialisation

It has exponential time complexity with dataset size

It is not suitable for datasets containing non-hyper-ellipsoidal clusters

None of the above

Answer explanation

Check the slides!

What does GMM-EM optimise?

Check definitions in slides
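For reference, EM for a Gaussian mixture is usually described as maximising the data log-likelihood, which is the same as minimising the negative log-likelihood, and each EM iteration should not make that quantity worse. The sketch below tracks the negative log-likelihood over a few EM steps on one-dimensional data. The data, the starting parameters and the function names are assumptions for illustration; a practical implementation would also guard against a variance collapsing to zero, which this sketch does not.

```python
import math

def normal_pdf(x, mu, var):
    """Density of a 1-D Gaussian with mean mu and variance var."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def nll(xs, weights, means, variances):
    """Negative log-likelihood of the data under the mixture."""
    return -sum(math.log(sum(w * normal_pdf(x, m, v)
                             for w, m, v in zip(weights, means, variances)))
                for x in xs)

def em_step(xs, weights, means, variances):
    """One E-step followed by one M-step."""
    # E-step: responsibility of each component for each point.
    resp = []
    for x in xs:
        joint = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
        total = sum(joint)
        resp.append([j / total for j in joint])
    # M-step: re-estimate weights, means and variances from the responsibilities.
    new_w, new_m, new_v = [], [], []
    for k in range(len(weights)):
        nk = sum(r[k] for r in resp)
        mu = sum(r[k] * x for r, x in zip(resp, xs)) / nk
        var = sum(r[k] * (x - mu) ** 2 for r, x in zip(resp, xs)) / nk
        new_w.append(nk / len(xs))
        new_m.append(mu)
        new_v.append(var)
    return new_w, new_m, new_v

# Hypothetical data with two well-separated groups, and a rough starting guess.
xs = [0.0, 0.5, 1.0, 1.5, 6.0, 6.5, 7.0, 7.5]
weights, means, variances = [0.5, 0.5], [1.0, 5.0], [1.0, 1.0]

history = [nll(xs, weights, means, variances)]
for _ in range(10):
    weights, means, variances = em_step(xs, weights, means, variances)
    history.append(nll(xs, weights, means, variances))
print(history)
```

The recorded values should form a non-increasing sequence, up to floating-point error.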

True or False. If the responsibility r_nk is high, it means that data point n is a plausible sample from the kth mixture component

Responsibilities give the probability that each data point belongs to each cluster. Remember that GMM-EM uses soft assignment!
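As a concrete illustration, the responsibility r_nk can be computed as the weighted density of component k at point n, normalised over all components. The two-component mixture and the query points below are made up for the example, and the function names are illustrative.

```python
import math

def normal_pdf(x, mu, var):
    """Density of a 1-D Gaussian with mean mu and variance var."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def responsibilities(x, weights, means, variances):
    """Return [r_n1, ..., r_nK] for a single point x; the values sum to 1."""
    joint = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    total = sum(joint)
    return [j / total for j in joint]

# Hypothetical mixture: two equally weighted unit-variance components.
weights, means, variances = [0.5, 0.5], [0.0, 4.0], [1.0, 1.0]

r_near = responsibilities(0.2, weights, means, variances)  # close to component 0
r_mid = responsibilities(2.0, weights, means, variances)   # halfway between the means

# A hard, k-means-style assignment would keep only the largest responsibility.
hard_label = max(range(len(r_near)), key=lambda k: r_near[k])
print(r_near, r_mid, hard_label)
```

The point near the first mean is assigned almost entirely to that component, while the midpoint is shared equally between the two, which a hard assignment cannot express.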

True or False? The only differences between GMM-EM and k-means are the non-isotropic distance to the centroids/means and that, for GMM-EM, this metric varies during the learning process.

K-means uses hard assignment, whereas GMM-EM uses soft assignment: every point belongs to every cluster, weighted by its responsibility.
