Intro to ML: Evaluation (Part 2)

Assessment

Quiz

Created by

Josiah Wang

Computers

University

Medium

10 questions

1.

MULTIPLE CHOICE QUESTION

1 min • 1 pt

Which of the following is not used to reduce overfitting?

Stopping the training earlier

Using a larger dataset

Reducing the complexity of the model

Using k-fold cross validation

Answer explanation

Using k-fold cross validation is not a method to reduce overfitting: it is a technique for estimating how well a model generalises, not for changing how the model is trained. Stopping training earlier, using a larger dataset, and reducing the complexity of the model are all strategies that prevent overfitting by limiting the model's ability to memorise the training data.
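As a concrete illustration of one of these strategies, here is a minimal early-stopping sketch. The validation-loss values and the patience setting are invented purely for illustration; real training code would compute the loss each epoch.

```python
# Early stopping: halt training once the validation loss has not
# improved for `patience` consecutive epochs, before the model
# starts memorising the training data.

def early_stop_epoch(val_losses, patience=2):
    """Return the epoch index at which training would stop."""
    best = float("inf")
    since_best = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            since_best = 0
        else:
            since_best += 1
        if since_best >= patience:
            return epoch  # stop here: validation loss is rising
    return len(val_losses) - 1  # early stopping never triggered

losses = [0.9, 0.7, 0.6, 0.65, 0.66, 0.5]  # made-up validation losses
print(early_stop_epoch(losses))  # -> 4: stops after two bad epochs
```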

2.

MULTIPLE CHOICE QUESTION

1 min • 1 pt

Which of the following will certainly reduce the size of the confidence interval for a model's error rate?

Increasing the number of examples in your sample

Improving your model to reduce its error rate

Answer explanation

The size of the confidence interval for a model's error rate can be reduced with certainty by increasing the number of examples in your sample. More examples give more data points, which in turn increases the precision of the error-rate estimate and narrows the confidence interval. Improving the model may lower the error rate, but that does not guarantee a narrower interval.

3.

MULTIPLE CHOICE QUESTION

1 min • 1 pt

Which of our data sets should we use to calculate the sample error of our model?

Train

Dev

Test

Answer explanation

The test data set should be used to calculate the sample error of the model. This is because the test set is specifically designed to evaluate the model's performance on unseen data, providing an unbiased estimate of its generalization ability. Using the train or dev sets would not give an accurate representation of the model's performance on new data.

4.

MULTIPLE CHOICE QUESTION

1 min • 1 pt

In the special case where our model's sample error is zero, which of the following is true when using the sample error formula?

The confidence interval will depend on the sample size

The confidence interval will be the same regardless of the sample size

Answer explanation

In the special case where the model's sample error is zero, the sample size does not affect the confidence interval. The interval's half-width is proportional to √(e(1 − e)/n), and with e = 0 the numerator vanishes, so the width is zero regardless of n.
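Reusing the normal-approximation half-width from earlier (z·√(e(1 − e)/n)), a quick check of the zero-error special case:

```python
import math

def ci_half_width(error_rate, n, z=1.96):
    """Half-width of the normal-approximation confidence interval."""
    return z * math.sqrt(error_rate * (1 - error_rate) / n)

# With zero sample error, the e * (1 - e) term vanishes,
# so the interval width is 0 whatever the sample size is.
print(ci_half_width(0.0, 50))    # 0.0
print(ci_half_width(0.0, 5000))  # 0.0
```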

5.

MULTIPLE CHOICE QUESTION

1 min • 1 pt

Which of the following is not a solution for dealing with an imbalanced dataset?

Downsample the majority class

Use several metrics, choosing the one that reflects the intended model behaviour

Use k-fold cross validation

Answer explanation

The question asks about methods to handle imbalanced datasets. Downsampling the majority class and using several metrics are both valid techniques. However, using k-fold cross validation is not a specific solution for dealing with an imbalanced dataset, making it the correct answer.
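A minimal sketch of downsampling the majority class, assuming binary 0/1 labels; the toy data and the fixed seed are purely for illustration:

```python
import random

def downsample_majority(examples, labels, seed=0):
    """Randomly drop majority-class examples until both classes
    are the same size. Assumes binary labels 0/1."""
    random.seed(seed)
    pos = [x for x, y in zip(examples, labels) if y == 1]
    neg = [x for x, y in zip(examples, labels) if y == 0]
    if len(neg) > len(pos):
        neg = random.sample(neg, len(pos))  # shrink the majority class
    else:
        pos = random.sample(pos, len(neg))
    return pos, neg

X = list(range(10))
y = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]  # 2 positives, 8 negatives
pos, neg = downsample_majority(X, y)
print(len(pos), len(neg))  # 2 2 -> balanced
```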

6.

MULTIPLE CHOICE QUESTION

1 min • 1 pt

Which of the following statements is false?

Overfitted models perform better on the training data than on the test data

Overfitting can occur when learning is performed for too long

Overfitting can occur if the training set is not representative

Underfitted models always generalise well to different datasets

Answer explanation

The statement 'Underfitted models always generalise well to different datasets' is false. Underfitting occurs when a model is too simple to capture the underlying structure of the data, so an underfitted model performs poorly on both the training data and new data. The other statements are true: overfitted models perform better on training data than on test data, and overfitting can arise from prolonged training or a non-representative training set.

7.

MULTIPLE CHOICE QUESTION

1 min • 1 pt

What is a common method to handle missing data in a dataset?

Remove all rows with missing values

Use a different model

Ignore the missing data

Answer explanation

Removing all rows with missing values is a common method to handle missing data, as it ensures that the analysis is based on complete cases, thus maintaining the integrity of the dataset.
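A minimal complete-case sketch of this method, representing missing values as `None` in plain Python lists (a dataframe library would offer an equivalent drop operation):

```python
def drop_incomplete_rows(rows):
    """Complete-case analysis: keep only rows with no missing values."""
    return [row for row in rows if None not in row]

data = [
    [5.1, 3.5, 0],
    [4.9, None, 1],   # missing feature value -> row dropped
    [6.2, 2.8, 1],
]
print(drop_incomplete_rows(data))  # [[5.1, 3.5, 0], [6.2, 2.8, 1]]
```

Note that if many rows contain missing values, this approach can discard a large fraction of the dataset.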

8.

MULTIPLE CHOICE QUESTION

1 min • 1 pt

Which of the following is a technique to prevent overfitting?

Adding more features to the model

Using dropout in neural networks

Increasing the learning rate

Answer explanation

Using dropout in neural networks is a regularization technique that helps prevent overfitting by randomly setting a fraction of input units to zero during training, which encourages the model to learn more robust features.
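A sketch of the mechanism using NumPy, with the standard "inverted dropout" rescaling so no change is needed at test time; the drop probability and input are illustrative:

```python
import numpy as np

def dropout(activations, p=0.5, rng=None, training=True):
    """Inverted dropout: zero each unit with probability p during
    training and rescale survivors by 1/(1-p), so the expected
    activation is unchanged and test time needs no adjustment."""
    if not training or p == 0.0:
        return activations
    rng = rng or np.random.default_rng(0)
    mask = rng.random(activations.shape) >= p  # keep with prob 1-p
    return activations * mask / (1.0 - p)

a = np.ones(8)
out = dropout(a, p=0.5)
print(out)  # some units zeroed, the survivors scaled up to 2.0
```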

9.

MULTIPLE CHOICE QUESTION

1 min • 1 pt

What is the purpose of using a validation set in model training?

To train the model

To test the model's performance on unseen data

To tune hyperparameters

Answer explanation

The validation set is primarily used to tune hyperparameters, allowing for adjustments based on model performance without overfitting to the training data. This helps in optimizing the model before final testing.
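A toy sketch of that workflow. The `score` function here is a stand-in for "train a model with hyperparameter h and measure its accuracy on the validation set"; real code would fit and evaluate an actual model:

```python
# Hyperparameter tuning against a held-out validation set:
# pick the setting with the best validation score, then report
# final performance on the untouched test set.

def tune(hyperparams, score, val_set):
    """Return the hyperparameter with the best validation score."""
    return max(hyperparams, key=lambda h: score(h, val_set))

# Toy scoring function: pretend validation accuracy peaks at h = 0.1.
score = lambda h, split: 1.0 - abs(h - 0.1)
best = tune([0.001, 0.01, 0.1, 1.0], score, val_set="dev")
print(best)  # 0.1
```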

10.

MULTIPLE CHOICE QUESTION

1 min • 1 pt

What is the purpose of using cross-validation in model evaluation?

To increase the model's accuracy

To assess the model's performance on different subsets of the data

To reduce the model's training time

Answer explanation

Cross-validation is used to assess the model's performance on different subsets of the data. This helps ensure that the model generalizes well and is not overfitting to a specific training set.
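A minimal sketch of how k-fold splits are constructed: every example is held out exactly once, and the model is trained and evaluated k times. The interleaved fold assignment below is one simple choice; shuffled folds are also common:

```python
def k_fold_indices(n, k):
    """Split indices 0..n-1 into k folds; each fold serves once as
    the held-out evaluation set, the rest as training data."""
    folds = [list(range(i, n, k)) for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        splits.append((train, test))
    return splits

for train, test in k_fold_indices(6, 3):
    print(sorted(test), sorted(train))
# every index appears in exactly one test fold
```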
