11 questions
If we predict every observation to be True, what will our model's precision be?
100%
0%
The proportion of True values in the dataset
Not enough information
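As a small worked illustration of the precision question above, the sketch below computes precision, TP / (TP + FP), for a classifier that predicts True for every observation. The labels and the helper name `precision` are made up for illustration and are not part of the quiz.

```python
def precision(y_true, y_pred):
    """Precision = TP / (TP + FP), for boolean labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p and t)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p and not t)
    if tp + fp == 0:
        raise ValueError("precision is undefined when nothing is predicted True")
    return tp / (tp + fp)


# Hypothetical labels: 3 of the 10 observations are True.
y_true = [True, False, False, True, False, False, False, True, False, False]
y_pred = [True] * len(y_true)  # predict every observation to be True

all_true_precision = precision(y_true, y_pred)
true_fraction = sum(y_true) / len(y_true)
```

With these made-up labels, `all_true_precision` and `true_fraction` can be compared directly to check an answer to the question.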
James, Amelia, and George are participating in a machine learning competition. They have to choose an algorithm for their project. Select which of the following algorithms they should consider if they want to use eager learners:
K-nearest neighbours
Decision trees
Neural networks
Linear regression
Which of the following statements are True:
Performance on the validation set can be used to see if a model is overfitting to the training data
We cannot tell from the training performance alone if a model is overfitting or not
Underfitting implies better generalisation to other datasets
Scarlett is working on a machine learning project and she is worried about underfitting. Which of the following actions may cause underfitting in her model?
Reducing the maximum depth of a decision tree
Increasing the value of K in K-NN
Adding more layers to a neural network
Increasing the size of the training data
Increasing the value of K in K-means
True or False:
If we use grid-search for testing different hyper-parameter values, we can use each of these results for finding the confidence interval of the model error.
True
False
Which of the following algorithms will produce different results given different random seeds:
Neural networks
K-nearest neighbours (K = 1, with no ties)
Decision trees
K-means
Evolutionary algorithms using simple tournament selection
Which of the statements below are True about the differences between Gradient Descent, Stochastic Gradient Descent and Mini-batched Gradient Descent:
Gradient Descent is faster to compute than Stochastic Gradient Descent
Stochastic Gradient Descent is faster to compute than Mini-batched Gradient Descent
There is less noise in the gradients when using Mini-batched Gradient Descent compared to Stochastic Gradient Descent
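To make the comparison above concrete, here is a minimal sketch that estimates the gradient of a squared-error loss in three ways: from all examples, from one random example, and from a random mini-batch. The data, the model y = w * x, the batch size of 32 and all names are assumptions chosen for illustration. The number of examples touched per step (200, 1 and 32 here) is what drives the per-step cost.

```python
import random
import statistics

random.seed(0)

# Hypothetical 1-D data: y = 3x + noise.
xs = [random.uniform(-1, 1) for _ in range(200)]
ys = [3 * x + random.gauss(0, 0.5) for x in xs]
w = 0.0  # current parameter value


def example_grad(i):
    """Gradient of the squared error (w*x - y)^2 for example i, with respect to w."""
    return 2 * (w * xs[i] - ys[i]) * xs[i]


def batch_grad(indices):
    """Average gradient over the given example indices."""
    return sum(example_grad(i) for i in indices) / len(indices)


# Gradient Descent: one estimate that uses every example.
full_grad = batch_grad(range(len(xs)))

# Stochastic Gradient Descent: many estimates, each from a single random example.
sgd_estimates = [
    batch_grad(random.sample(range(len(xs)), 1)) for _ in range(2000)
]

# Mini-batched Gradient Descent: many estimates, each from 32 random examples.
minibatch_estimates = [
    batch_grad(random.sample(range(len(xs)), 32)) for _ in range(2000)
]

# Spread of the estimates around their mean, used here as a measure of gradient noise.
sgd_spread = statistics.pstdev(sgd_estimates)
minibatch_spread = statistics.pstdev(minibatch_estimates)
```

Comparing `sgd_spread` with `minibatch_spread` shows how much averaging over a batch reduces the noise in the gradient estimate on this toy problem.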
Which of the following statements about K-means are True:
The algorithm always converges
The algorithm always converges to a global optimum
The algorithm doesn’t always converge
If the algorithm does converge, it will converge to a global optimum
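The convergence behaviour asked about above can be explored with a minimal sketch of K-means (Lloyd's algorithm) in plain Python. The four points and both initialisations are made up for illustration, and `kmeans` is a hypothetical helper, not a library function. It records the within-cluster sum of squares after every iteration so that runs from different starting centroids can be compared.

```python
def kmeans(points, centroids, max_iters=100):
    """Plain Lloyd's algorithm.

    Returns the final centroids and the objective (within-cluster sum of
    squared distances) recorded after each iteration.
    """
    history = []
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        # Update step: each centroid moves to the mean of its cluster
        # (an empty cluster keeps its old centroid).
        new_centroids = [
            tuple(sum(coord) / len(cl) for coord in zip(*cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
        objective = sum(
            sum((a - b) ** 2 for a, b in zip(p, c))
            for cl, c in zip(clusters, new_centroids)
            for p in cl
        )
        history.append(objective)
        if new_centroids == centroids:  # assignments can no longer change
            break
        centroids = new_centroids
    return centroids, history


# Hypothetical data: the four corners of a 4 x 1 rectangle, with K = 2.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

# Two different initialisations of the centroids.
centroids_a, history_a = kmeans(points, [(2.0, 0.0), (2.0, 1.0)])
centroids_b, history_b = kmeans(points, [(0.0, 0.0), (4.0, 1.0)])
```

Both runs stop after a finite number of iterations, and the final entries of `history_a` and `history_b` give the objective each run ended on.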
For a Gaussian Mixture Model, which of the following statements are True:
The responsibilities r_ik for the ith data point sum to 1
The responsibilities r_ni for the ith mixture component sum to 1
None of the above are True
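The statements above can be checked numerically. The sketch below assumes the standard definition of responsibilities, r_ik = pi_k N(x_i | mu_k, sigma_k) / sum_j pi_j N(x_i | mu_j, sigma_j), and computes them for a made-up 1-D mixture with two components. The mixture parameters, the data and all names are hypothetical.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))


# Hypothetical 1-D mixture with two components.
weights = [0.4, 0.6]  # mixing coefficients pi_k
means = [0.0, 5.0]
stds = [1.0, 2.0]
data = [-1.0, 0.5, 2.0, 4.0, 7.5]

# responsibilities[i][k] is r_ik: how much component k explains data point i.
responsibilities = []
for x in data:
    weighted = [w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, stds)]
    total = sum(weighted)
    responsibilities.append([v / total for v in weighted])

# Sum over components, one value per data point.
per_point_sums = [sum(row) for row in responsibilities]
# Sum over data points, one value per mixture component.
per_component_sums = [sum(col) for col in zip(*responsibilities)]
```

Inspecting `per_point_sums` and `per_component_sums` shows which of the two sums is fixed by the normalisation and which depends on the data.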
Which one of the following is correct?
a) sigmoid b) tanh c) ReLU
a) tanh b) sigmoid c) Linear
a) sigmoid b) tanh c) Linear
a) softmax b) tanh c) ReLU
None of the above
Which of the following functions are not suitable candidates for the activation functions of a neural network’s hidden layers?
b
c
d
e
f