6 questions
Is the following statement True or False? A neural network's weights can be initialised to any random values, because gradient descent will always eventually find the optimal set of parameters; the solution it reaches is therefore invariant to the initial parameters.
True
False
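As background for the question above: gradient descent on a non-convex loss can settle in different minima depending on where it starts. Below is a minimal plain-Python sketch (the toy double-well loss, step size and starting points are made up for illustration) showing two runs of the same optimiser ending at different parameters purely because of their initial values.

def grad(w):
    # Gradient of the non-convex double-well loss L(w) = (w**2 - 1)**2,
    # which has local minima at w = -1 and w = +1.
    return 4 * w * (w**2 - 1)

def gradient_descent(w0, lr=0.05, steps=200):
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)
    return w

# The same optimiser reaches different minima from different starting points,
# so the final parameters are not invariant to the initialisation.
print(gradient_descent(w0=-0.5))  # approaches -1.0
print(gradient_descent(w0=+0.5))  # approaches +1.0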
L1 regularisation favours solutions with few non-zero weights, whereas L2 regularisation favours small weight values close to zero.
True
False
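For reference, here is a minimal NumPy sketch (the weight vector and regularisation strength are arbitrary illustrative values) of how the two penalty terms are typically added to a training loss.

import numpy as np

weights = np.array([0.0, 0.0, 3.0, -2.0, 0.0])  # illustrative weight vector
lam = 0.01                                       # regularisation strength (hyperparameter)

l1_penalty = lam * np.sum(np.abs(weights))  # |w| penalty: pushes many weights to exactly zero
l2_penalty = lam * np.sum(weights ** 2)     # w**2 penalty: shrinks weights towards small values near zero

# total_loss = data_loss + l1_penalty   (or data_loss + l2_penalty)
print(l1_penalty, l2_penalty)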
Which of the following statements about dropout are correct?
Dropout prevents complex co-adaptations in which a feature detector is only helpful in the context of several other specific feature detectors.
Dropout is more similar to L2 regularisation than to L1 regularisation during training.
Dropout is active during training and testing.
Dropout can be viewed as a form of ensemble learning.
The amount of dropout, p, can be optimised through standard stochastic gradient descent (SGD) methods.
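For reference, a minimal PyTorch sketch (the tensor size and dropout rate are arbitrary) showing that dropout behaves differently in training and evaluation mode, and that the dropout probability p is passed in as a fixed hyperparameter rather than a weight updated by SGD.

import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)   # p is a fixed hyperparameter, not a parameter updated by SGD
x = torch.ones(1, 8)

drop.train()               # training mode: units are zeroed at random, survivors rescaled by 1/(1-p)
print(drop(x))             # roughly half the entries are 0.0, the rest are 2.0

drop.eval()                # evaluation mode: dropout is a no-op
print(drop(x))             # all entries stay 1.0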
A dense multi-layer perceptron (MLP) layer has 300 input values and 200 output values. Assuming no bias, how many parameters does the layer contain?
500
250
6000
60000
3000
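To check the arithmetic: a dense layer with 300 inputs, 200 outputs and no bias is a single 300 × 200 weight matrix, i.e. 300 × 200 = 60 000 parameters. A minimal PyTorch sketch with the sizes from the question:

import torch.nn as nn

layer = nn.Linear(in_features=300, out_features=200, bias=False)
print(sum(p.numel() for p in layer.parameters()))  # 300 * 200 = 60000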
If a neural network is overfitting, which of the following would not help?
Introducing dropout
Reducing the number of layers in the model
Increasing the learning rate
Increasing the size of the training data
None of the above
Which of the following statements are True?
Minibatch Gradient Descent (GD) updates the network based on an expectation (a sample estimate) of the loss gradient at that point in parameter space.
Minibatch GD updates the network based on the exact loss gradient at that point in parameter space.
The computational cost of batch GD and minibatch GD is the same.
The computational cost of batch GD is larger than that of minibatch GD.
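As background, a minimal NumPy sketch (the linear model, data sizes and batch size are made up for illustration) contrasting the full-batch gradient of a least-squares loss with a minibatch estimate of it: the minibatch gradient is computed from far fewer rows, and its expectation over random minibatches equals the full-batch gradient.

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 20))                   # full training set
y = X @ rng.normal(size=20) + 0.1 * rng.normal(size=10_000)
w = np.zeros(20)                                    # current parameters

def mse_grad(Xb, yb, w):
    # Gradient of 0.5 * mean((Xb @ w - yb) ** 2) with respect to w.
    return Xb.T @ (Xb @ w - yb) / len(yb)

full_grad = mse_grad(X, y, w)                       # batch GD: exact gradient, uses all 10 000 rows

idx = rng.choice(len(X), size=64, replace=False)    # random minibatch of 64 rows
mini_grad = mse_grad(X[idx], y[idx], w)             # minibatch GD: cheaper, noisy estimate
print(np.linalg.norm(full_grad - mini_grad))        # small but non-zero difference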