Intro to ML: Neural Networks Lecture 2 Part 1

Assessment • Josiah Wang • Mathematics, Computers, Fun • University • 42 plays • Hard

6 questions
1.
MULTIPLE CHOICE
30 sec • 1 pt
Mean squared error is a common loss function for which task?
Answer explanation
Mean squared error is the expected (squared) distance between the model's predictions and the true values. It is therefore well suited to regression tasks, where the label space is continuous. MSE implicitly assumes the data is normally distributed; in binary classification tasks, however, the data follows a Bernoulli distribution.
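As a concrete illustration of the explanation above, here is a minimal sketch (with made-up regression targets) of computing MSE as the mean squared distance between predictions and true values:

```python
import numpy as np

# Hypothetical continuous targets and predictions for a regression task.
y_true = np.array([2.0, 0.5, 3.0])
y_pred = np.array([2.5, 0.0, 3.0])

# MSE: average of the squared differences.
mse = np.mean((y_pred - y_true) ** 2)  # (0.25 + 0.25 + 0.0) / 3
```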
2.
MULTIPLE CHOICE
30 sec • 1 pt
Is the following statement True or False? Multi-class and Multi-label classification are the same thing.
Answer explanation
Multiclass classification - a classification task where each instance is assigned to exactly one of two or more classes.
Multilabel classification - each instance is assigned a set of target labels. Think of this as predicting a series of properties of an instance which are not mutually exclusive.
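A small sketch (with hypothetical logits over three classes/labels) of how the two output schemes differ in practice: softmax picks exactly one class, while independent sigmoids allow any subset of labels.

```python
import numpy as np

# Hypothetical logits for one instance over 3 classes/labels.
logits = np.array([2.0, -1.0, 0.5])

# Multiclass: softmax probabilities sum to 1; predict exactly one class.
probs = np.exp(logits) / np.sum(np.exp(logits))
multiclass_prediction = int(np.argmax(probs))

# Multilabel: independent sigmoids, thresholded per label;
# any subset of labels may be active at once.
sigmoid = 1 / (1 + np.exp(-logits))
multilabel_prediction = (sigmoid > 0.5).astype(int)
```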
3.
MULTIPLE CHOICE
2 mins • 1 pt
The shape of the weight matrix W of a neural network linear layer is (x, y). A forward pass through this layer can be represented as follows:
Z = XW
where X is the batched data with dimensions (batch size, input features) and W is the weight matrix. Select the correct assignment for (x, y).
Answer explanation
Every neuron in a fully connected neural layer is connected to every input via a weight. This gives each neuron in the current layer a weight vector whose dimension equals the input dimension; stacking these vectors, one per neuron, forms the weight matrix. For Z = XW the neighbouring (inner) dimensions need to match, so W must have shape (input features, number of neurons).
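The shape rule above can be checked directly with a small sketch (batch size, feature count, and neuron count are arbitrary example values):

```python
import numpy as np

# Hypothetical sizes: 4 examples, 3 input features, 5 neurons in the layer.
batch_size, in_features, n_neurons = 4, 3, 5

X = np.random.randn(batch_size, in_features)  # (batch size, input features)
W = np.random.randn(in_features, n_neurons)   # (x, y) = (input features, neurons)

# Inner dimensions match: (4, 3) @ (3, 5) -> (4, 5).
Z = X @ W
```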
4.
MULTIPLE CHOICE
30 sec • 1 pt
If a neural network has a single output neuron, then the model may be used for:
5.
MULTIPLE CHOICE
5 mins • 1 pt
Here we have a computational graph representing a series of operations. The green text (above the lines) represents the forward pass, i.e. the values at each stage of the graph during forward propagation. The red values (below the lines) represent gradient signals which have been passed back down the computational graph after some loss has been calculated. Calculate the missing gradients a, b and c using backpropagation (slides 20-31).
Answer explanation
The key with this computational graph is to apply the chain rule at each node. Please see the attached solution for more information.
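As a minimal sketch of the chain-rule-at-each-node idea (this is not the graph from the question, just an illustrative example f = (x + y) * z):

```python
# Hypothetical inputs to a tiny computational graph f = (x + y) * z.
x, y, z = 2.0, -1.0, 3.0

# Forward pass: record the intermediate values at each node.
q = x + y  # add node
f = q * z  # multiply node

# Backward pass: apply the chain rule at each node.
df_df = 1.0          # gradient of the output with respect to itself
df_dq = z * df_df    # multiply node: d(q*z)/dq = z
df_dz = q * df_df    # multiply node: d(q*z)/dz = q
df_dx = 1.0 * df_dq  # add node passes the upstream gradient through
df_dy = 1.0 * df_dq
```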
6.
MULTIPLE SELECT
3 mins • 1 pt
Which of the following are correct statements about the models which created the plots i) and ii)?
Answer explanation
i) is demonstrating high bias because 1) the validation and training errors are high and at a similar level, and 2) the training error increases as we add more training data. This happens because the capacity of the model limits its ability to fit all the modalities present in the larger dataset.
ii) is demonstrating high variance because the validation error is much higher than the training error. The model has overfit to spurious patterns in the training set which are not representative of the true distribution (represented by the validation set).
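The two diagnostic rules above can be sketched as a hypothetical helper (the function name and thresholds are illustrative, not from the lecture):

```python
def diagnose(train_error, val_error, high_threshold=0.2, gap_threshold=0.1):
    """Rough learning-curve diagnosis: high bias vs high variance.

    Rules mirror the explanation: high bias -> both errors high and
    close together; high variance -> validation error far above
    training error. Thresholds are arbitrary example values.
    """
    gap = val_error - train_error
    if val_error > high_threshold and gap < gap_threshold:
        return "high bias (underfitting)"
    if gap >= gap_threshold:
        return "high variance (overfitting)"
    return "ok"
```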